Alessandro Fonda 


A Modern 
Introduction to 
Mathematical 
Analysis 


Birkhäuser




Alessandro Fonda 

Dipartimento di Matematica e Geoscienze 
Università degli Studi di Trieste

Trieste, Italy 


ISBN 978-3-031-23712-6 ISBN 978-3-031-23713-3 (eBook) 
https://doi.org/10.1007/978-3-031-23713-3 


© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland 
AG 2023 

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether 
the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations, 
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or 
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar 
methodology now known or hereafter developed. 

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication 
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant 
protective laws and regulations and therefore free for general use. 

The publisher, the authors, and the editors are safe to assume that the advice and information in this book 
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or 
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any 
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional 
claims in published maps and institutional affiliations. 


This book is published under the imprint Birkhäuser, www.birkhauser-science.com by the registered
company Springer Nature Switzerland AG 
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland 


To my parents, Thea and Luciano 


Preface


This book brings together the classical topics of mathematical analysis normally 
taught in the first two years of a university course. It is the outcome of the lessons 
I have been teaching for many years in the undergraduate courses in mathematics, 
physics, and engineering at my university. 

Many excellent books on mathematical analysis have already been written, so 
a natural question to ask is: Why write another book on this subject? I will try to 
provide a brief answer to that question. 

The main novelty of this book lies in the treatment of the theory of the integral. 
Kurzweil and Henstock’s theory is presented here in Chaps. 7, 9, and 11. Compared 
to Riemann’s theory, it requires modest additional effort from the student, but that 
effort will be repaid with significant benefits. Consider that it includes Lebesgue’s 
theory itself, since a function is integrable according to Kurzweil and Henstock if 
and only if it is integrable according to Lebesgue, together with its absolute value. In 
this theory, the Fundamental Theorem turns out to be very general and natural, and 
it finds its generalization in Taylor’s formula with an integral remainder, requiring 
only essential hypotheses. Moreover, the improper integral happens to be a normal 
integral. 

Despite the modest additional effort required, no demands for preliminary 
knowledge about the integral will be made on the student for the purpose of reading 
this book. Students will be guided as they construct all the necessary mathematical 
tools, starting from the very beginning. 

Indeed, the book starts with some preliminaries on logic and set theory. It is 
a short vademecum, without formal rigor, and will help readers orient themselves 
with respect to the notations to be used later on. 

In Chap. 1, we introduce the main sets on which we base the rest of the theory.
These are the numeric sets, mainly $\mathbb{R}$ and $\mathbb{C}$, the space $\mathbb{R}^N$, and the metric spaces. It
is in this general context that the concepts of continuity and limit will be developed 
in Chaps. 2 and 3, respectively. The discussion of numerical series, together with the 
series of functions, will be postponed to Chap. 8. 

Chapter 4 is dedicated to the notions of compactness and completeness. Although 
they seem to be rather abstract concepts, they happen to be necessary for a rigorous 
treatment of differential and integral calculus. 

I would like to make special mention here of the original construction of the
exponential function and the trigonometric functions, which I propose in Chap. 5.
They are introduced as particular cases of a function with complex values that is
constructed with elementary geometric tools.

Differential calculus is first developed in Chap. 6 for functions of one real
variable and, later in Chap. 10, for functions of several variables. Here the reader
will find the implicit function theorem proved by induction, as in the original proofs
by Dini and Genocchi–Peano.

As was already noted, integral calculus is presented in Chaps. 7, 9, and 11. This 
approach to the integral was introduced independently by J. Kurzweil [5] in 1957 
and R. Henstock [3] in 1961. 

In Chap. 12, the theory of differential forms and their integral on M-surfaces 
and on differentiable M-manifolds is developed in detail, following the approach 
of Spivak [6]. The Stokes–Cartan theorem, with its classic corollaries, the curl
and divergence theorems, is the final result. Besides being a fundamental tool in
applications, it stands out for its elegance and formal perfection, like the most 
sublime works of art. 

The book can be used at different teaching levels, in line with the preferences 
of the teacher. As mentioned earlier, I have proposed it as an early postsecondary 
text. However, it could also be used in an advanced course in analysis or by 
scholars wishing to understand the Kurzweil–Henstock integral starting with a
simple approach. 

Unlike the majority of books on these subjects, this one contains almost no
exercises. The reason for this is that many textbooks containing only exercises have
already been published, and these fit perfectly with the topics of this book. An
example is Solving Problems in Mathematical Analysis by T. Radożycki, published
by Springer in 2020, which is divided into three volumes:


— Part I. Sets, Functions, Limits, Derivatives, Integrals, Sequences, and Series; 

— Part II. Definite, Improper and Multidimensional Integrals, Functions of Several 
Variables, and Differential Equations; 

— Part III. Curves and Surfaces, Conditional Extremes, Curvilinear Integrals, 
Complex Functions, Singularities, and Fourier Series. 


A list of other recent textbooks with solved exercises is provided in the bibliography. 

Finally, I would like to mention that this book would never have been written without
the strong motivation provided by my students. I thank them all, and I hope that the 
book will encourage others in the future to become involved in such a beautiful and 
fruitful theory as mathematical analysis. 


Trieste, Italy Alessandro Fonda 


Contents


Part I The Basics of Mathematical Analysis

1 Sets of Numbers and Metric Spaces
1.1 The Natural Numbers and the Induction Principle
1.1.1 Recursive Definitions
1.1.2 Proofs by Induction
1.1.3 The Binomial Formula
1.2 The Real Numbers
1.2.1 Supremum and Infimum
1.2.2 The Square Root
1.2.3 Intervals
1.2.4 Properties of $\mathbb{Q}$ and $\mathbb{R} \setminus \mathbb{Q}$
1.3 The Complex Numbers
1.3.1 Algebraic Equations in $\mathbb{C}$
1.3.2 The Modulus of a Complex Number
1.4 The Space $\mathbb{R}^N$
1.4.1 Euclidean Norm and Distance
1.5 Metric Spaces

2 Continuity
2.1 Continuous Functions
2.2 Intervals and Continuity
2.3 Monotone Functions
2.4 The Exponential Function
2.5 The Trigonometric Functions
2.6 Other Examples of Continuous Functions

3 Limits
3.1 The Notion of Limit
3.2 Some Properties of Limits
3.3 Change of Variables in the Limit
3.4 On the Limit of Restrictions
3.5 The Extended Real Line
3.6 Some Operations with $-\infty$ and $+\infty$
3.7 Limits of Monotone Functions
3.8 Limits for Exponentials and Logarithms
3.9 Liminf and Limsup

4 Compactness and Completeness
4.1 Some Preliminaries on Sequences
4.2 Compact Sets
4.3 Compactness and Continuity
4.4 Complete Metric Spaces
4.5 Completeness and Continuity
4.6 Spaces of Continuous Functions

5 Exponential and Circular Functions
5.1 The Construction
5.1.1 Preliminaries for the Proof
5.1.2 Definition on a Dense Set
5.1.3 Extension to the Whole Real Line
5.2 Exponential and Circular Functions
5.3 Limits for Trigonometric Functions


Part II Differential and Integral Calculus in $\mathbb{R}$

6 The Derivative
6.1 Some Differentiation Rules
6.2 The Derivative Function
6.3 Remarkable Properties of the Derivative
6.4 Inverses of Trigonometric and Hyperbolic Functions
6.5 Convexity and Concavity
6.6 L'Hôpital's Rules
6.7 Taylor Formula
6.8 Local Maxima and Minima
6.9 Analyticity of Some Elementary Functions

7 The Integral
7.1 Riemann Sums
7.2 $\delta$-Fine Tagged Partitions
7.3 Integrable Functions on a Compact Interval
7.4 Elementary Properties of the Integral
7.5 The Fundamental Theorem
7.6 Primitivable Functions
7.7 Primitivation by Parts and by Substitution
7.8 The Taylor Formula with Integral Form Remainder
7.9 The Cauchy Criterion
7.10 Integrability on Subintervals
7.11 R-Integrable and Continuous Functions
7.12 Two Theorems Involving Limits
7.13 Integration on Noncompact Intervals
7.14 Functions with Vector Values


Part III Further Developments

8 Numerical Series and Series of Functions
8.1 Introduction and First Properties
8.2 Series of Real Numbers
8.3 Series of Complex Numbers
8.4 Series of Functions
8.4.1 Power Series
8.4.2 The Complex Exponential Function
8.4.3 Taylor Series
8.4.4 Fourier Series
8.5 Series and Integrals

9 More on the Integral
9.1 Saks–Henstock Theorem
9.2 L-Integrable Functions
9.3 Monotone Convergence Theorem
9.4 Dominated Convergence Theorem
9.5 Hake's Theorem


Part IV Differential and Integral Calculus in $\mathbb{R}^N$

10 The Differential
10.1 The Differential of a Scalar-Valued Function
10.2 Some Computational Rules
10.3 Twice Differentiable Functions
10.4 Taylor Formula
10.5 The Search for Maxima and Minima
10.6 Implicit Function Theorem: First Statement
10.7 The Differential of a Vector-Valued Function
10.8 The Chain Rule
10.9 Mean Value Theorem
10.10 Implicit Function Theorem: General Statement
10.11 Local Diffeomorphisms
10.12 M-Surfaces
10.13 Local Analysis of M-Surfaces
10.14 Lagrange Multipliers
10.15 Differentiable Manifolds

11 The Integral
11.1 Integrability on Rectangles
11.2 Integrability on a Bounded Set
11.3 The Measure
11.4 Negligible Sets
11.5 A Characterization of Measurable Bounded Sets
11.6 Continuous Functions and L-Integrable Functions
11.7 Limits and Derivatives under the Integration Sign
11.8 Reduction Formula
11.9 Change of Variables in the Integral
11.10 Change of Measure by Diffeomorphisms
11.11 The General Theorem on Change of Variables
11.12 Some Useful Transformations in $\mathbb{R}^2$
11.13 Cylindrical and Spherical Coordinates in $\mathbb{R}^3$
11.14 The Integral on Unbounded Sets
11.15 The Integral on M-Surfaces
11.16 M-Dimensional Measure
11.17 Length and Area
11.18 Approximation with Smooth M-Surfaces
11.19 The Integral on a Compact Manifold

12 Differential Forms
12.1 An Informal Definition
12.2 Algebraic Operations
12.3 The Exterior Differential
12.4 Differential Forms in $\mathbb{R}^3$
12.5 The Integral on an M-Surface
12.6 Pull-Back Transformation
12.7 Oriented Boundary of a Rectangle
12.8 Gauss Formula
12.9 Oriented Boundary of an M-Surface
12.10 Stokes–Cartan Formula
12.11 Physical Interpretation of Curl and Divergence
12.12 The Integral on an Oriented Compact Manifold
12.13 Closed and Exact Differential Forms
12.14 On the Precise Definition of a Differential Form

Bibliography

Index


Preliminaries 


The goal of this preliminary section is to introduce the main language and notation
used in the book. Logic and set theory are treated in an informal way, without aiming
for the highest mathematical rigor. Indeed, a rigorous treatment would require a 
solid background in mathematics, which students just starting out in their college 
career will not usually possess. 


The Symbols of Logic 
In mathematical language, we usually deal with propositions, indicated by P, Q, 
and so forth. Moreover, we are accustomed to combining them in different ways, 
for example, 

P and Q, P or Q, $P \Rightarrow Q$, $P \Leftrightarrow Q$.

Let us explain the meaning of these. We start with

P and Q.


It is true if both P and Q are true; otherwise, it is false. We can draw a table where
all four cases are exemplified:¹

P   Q   P and Q
T   T   T
T   F   F
F   T   F
F   F   F

¹ In all tables, T means that a proposition is true, F that it is false.


Let us now consider 

P or Q.

It is true if at least one of the two is true and false when both P and Q are false.
Here is the corresponding table:

P   Q   P or Q
T   T   T
T   F   T
F   T   T
F   F   F


Let us now analyze 

$P \Rightarrow Q$.

It is false only when P is true and Q is false; in all other cases it is true. Here is its
table:

P   Q   $P \Rightarrow Q$
T   T   T
T   F   F
F   T   T
F   F   T


Let us conclude with 

$P \Leftrightarrow Q$.

It is true if both are true or if both are false. Otherwise, it is false. And here is its
table:

P   Q   $P \Leftrightarrow Q$
T   T   T
T   F   F
F   T   F
F   F   T


It is very important to be able to logically deny a proposition. The negation of P
will be denoted by $\neg P$ (read as “non P”): it is true when P is false, and vice versa.
For example, we have the following De Morgan rules:

$\neg(P \text{ and } Q)$ is equivalent to $\neg P$ or $\neg Q$,
$\neg(P \text{ or } Q)$ is equivalent to $\neg P$ and $\neg Q$.

It is possible, moreover, to verify that

$P \Rightarrow Q$ is equivalent to $\neg P$ or $Q$.

Consequently,

$\neg(P \Rightarrow Q)$ is equivalent to P and $\neg Q$.
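
Since each of these equivalences involves only two propositions, it can be checked by running through the four possible truth assignments. The following Python sketch does this; the names `IMPLIES` and `equivalences_hold` are ours, chosen only for illustration, and the dictionary encodes the table for the implication exactly as described in words above.

```python
from itertools import product

# Truth table of "P => Q" as described in the text:
# false only when P is true and Q is false.
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}


def equivalences_hold() -> bool:
    """Check the four stated equivalences on every truth assignment."""
    for p, q in product([True, False], repeat=2):
        # De Morgan rules
        de_morgan_and = (not (p and q)) == ((not p) or (not q))
        de_morgan_or = (not (p or q)) == ((not p) and (not q))
        # P => Q is equivalent to (not P) or Q
        implication = IMPLIES[(p, q)] == ((not p) or q)
        # not (P => Q) is equivalent to P and (not Q)
        negated = (not IMPLIES[(p, q)]) == (p and (not q))
        if not (de_morgan_and and de_morgan_or and implication and negated):
            return False
    return True


print(equivalences_hold())
```

An exhaustive check of this kind is a complete proof here only because there are finitely many cases to examine.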


Logical Propositions 




Our propositions will often involve one or more “variables.” For example, we could
write them as follows: P(x), which contains the variable x, in which case we will
typically find the following two types of propositions. The first one,

$\forall x : P(x)$,

means

“for every x one has that P(x) is true.”

The second,

$\exists x : P(x)$,

means

“there exists at least one x for which P(x) is true.”

Let us see how their negation can be formulated. One has that

$\neg(\forall x : P(x))$ is equivalent to $\exists x : \neg P(x)$

and

$\neg(\exists x : P(x))$ is equivalent to $\forall x : \neg P(x)$.

To be more precise, these x will be assumed to be the elements of some set. Thus,
this leads us to a brief review of the theory of sets.
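
When x ranges over a finite set, the two negation rules can be observed directly. The sketch below is only an illustration under that finiteness assumption: the helper names `for_all`, `there_exists`, and `negation_rules_hold` are hypothetical, and the sample set {0, 1, ..., 9} and the three predicates are arbitrary choices of ours.

```python
def for_all(predicate, elements) -> bool:
    """'For every x in elements, predicate(x) is true.'"""
    return all(predicate(x) for x in elements)


def there_exists(predicate, elements) -> bool:
    """'There exists at least one x in elements for which predicate(x) is true.'"""
    return any(predicate(x) for x in elements)


def negation_rules_hold(predicate, elements) -> bool:
    """Compare both sides of the two negation rules on a finite collection."""
    elements = list(elements)

    def not_p(x):
        return not predicate(x)

    rule_1 = (not for_all(predicate, elements)) == there_exists(not_p, elements)
    rule_2 = (not there_exists(predicate, elements)) == for_all(not_p, elements)
    return rule_1 and rule_2


sample = range(10)  # an arbitrary finite set, chosen for illustration
print(negation_rules_hold(lambda x: x % 2 == 0, sample))  # true for some x only
print(negation_rules_hold(lambda x: x >= 0, sample))      # true for every x
print(negation_rules_hold(lambda x: x > 100, sample))     # true for no x
```

The three predicates cover the cases in which P(x) holds for some, all, or none of the elements.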




The Language of Set Theory 
First Symbols 


We are more or less familiar with some numerical sets like, for example, 
$\mathbb{N}$, the set of natural numbers;
$\mathbb{Z}$, the set of integer numbers;
$\mathbb{Q}$, the set of rational numbers;
$\mathbb{R}$, the set of real numbers;
$\mathbb{C}$, the set of complex numbers.
Their nature will be further studied as we progress through the book, and several 
other sets will be introduced later. To treat sets correctly, we need to develop a 
proper language. This is why we will now introduce some symbols explaining their 
meaning. 

Let us first introduce the symbol $\in$. Writing

$a \in A$

means “a belongs to the set A” or “a is an element of A.” Its negation is written
$a \notin A$ and reads “a does not belong to A” or “a is not an element of A.”

For example, let $A = \{1, 2, 3\}$ be the set² whose elements are the three natural
numbers 1, 2, and 3. We clearly have

$1 \in A$, $2 \in A$, $3 \in A$,

whereas

$4 \notin A$, $\frac{1}{2} \notin A$, $\pi \notin A$.

Let us now present the symbol $\subseteq$. We will write

$A \subseteq B$

and read “A is contained in B” whenever every element of A is also an element of
B. In symbols,

$x \in A \Rightarrow x \in B$.

For example, if, as previously, $A = \{1, 2, 3\}$, we have that $A \subseteq \mathbb{N}$, but also $A \subseteq \mathbb{R}$.

² In this example, the set A is defined by listing its elements, which are finite in number.

If $A \subseteq B$, we also say that “A is a subset of B,” and we can also write $B \supseteq A$.
The negation of $A \subseteq B$ is written $A \not\subseteq B$ or $B \not\supseteq A$, and we read this as “A is not
contained in B” or “B does not contain A.”

We say that two sets A and B are “equal” if they coincide, i.e., if they have the
same elements; in such a case we will write

$A = B$.

Therefore,

$A = B \Leftrightarrow A \subseteq B$ and $B \subseteq A$.

The negation of $A = B$ is written as $A \neq B$; in this case, we say that A and B are
different, i.e., they do not coincide.

Let us emphasize the following “order relation” properties:

• $A \subseteq A$;
• $A \subseteq B$ and $B \subseteq A \Rightarrow A = B$;
• $A \subseteq B$ and $B \subseteq C \Rightarrow A \subseteq C$.

We end this section by introducing a very peculiar set, the “empty set,” which is
a set having no elements. It is denoted by the symbol

$\emptyset$.

It is convenient to consider $\emptyset$ as a subset of any other set, i.e.,

$\emptyset \subseteq A$, for any A.


Some Examples of Sets 
Let us begin with the simplest sets, those having a single element, for example, 

$A = \{3\}$, $A = \{\mathbb{N}\}$, $A = \{\emptyset\}$.

The first one is a set having the number 3 as its only element. The second one has
a single element $\mathbb{N}$, and the third one has only the element $\emptyset$. We thus observe that
the elements of a set may be other sets as well. We could have sets of the type

$A = \{\mathbb{N}, \mathbb{Q}\}$, $A = \{9, \mathbb{Z}, \mathbb{R}\}$, $A = \{\{7\}, \{1, 2, 3\}, \mathbb{N}\}$

and of the type

$A = \{3, \{3\}, \mathbb{N}, \{\mathbb{N}, \mathbb{Q}\}\}$.

In this last case, one must be careful with symbols: we see that $3 \in A$, hence $\{3\} \subseteq A$,
but also $\{3\} \in A$, since $\{3\}$ is itself an element of A.

Let us also consider, as a last example, the set

$A = \{\emptyset, \{\emptyset\}\}$.

We have that $\emptyset \in A$, since $\emptyset$ is one of the elements of A, and hence $\{\emptyset\} \subseteq A$. But
we also have that $\{\emptyset\} \in A$, since $\{\emptyset\}$ is an element of A, and hence $\{\{\emptyset\}\} \subseteq A$. We
also recall that $\emptyset \subseteq A$, since this is true for every set.
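
The difference between “being an element of” and “being a subset of” in this last example can be replayed on a computer. The sketch below models the set {∅, {∅}} with Python's `frozenset` type, which is our modeling choice (an ordinary Python `set` cannot be an element of another set); the variable and function names are hypothetical.

```python
# Model of the empty set and of A = {emptyset, {emptyset}} using frozensets.
EMPTY = frozenset()
SINGLETON_EMPTY = frozenset({EMPTY})       # the set {emptyset}
A = frozenset({EMPTY, SINGLETON_EMPTY})    # the set {emptyset, {emptyset}}


def membership_facts() -> dict:
    """Evaluate the five relations discussed in the text."""
    return {
        "emptyset in A": EMPTY in A,
        "{emptyset} subset of A": SINGLETON_EMPTY.issubset(A),
        "{emptyset} in A": SINGLETON_EMPTY in A,
        "{{emptyset}} subset of A": frozenset({SINGLETON_EMPTY}).issubset(A),
        "emptyset subset of A": EMPTY.issubset(A),
    }


for statement, value in membership_facts().items():
    print(statement, "->", value)
```

Note that `in` tests membership while `issubset` tests inclusion, which mirrors the distinction between the two symbols.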


Operations with Sets

It is normal practice to choose a “universal set” where we operate. We will denote it
by E. All the objects we will speak of necessarily lie in this set.

We define the “intersection” of two sets A and B: it is the set³

$A \cap B = \{x : x \in A \text{ and } x \in B\}$,

whose elements belong to both sets. Notice that the intersection could also be the
empty set: in that case, we say that A and B are “disjoint.”

On the other hand, the “union” of two sets A and B is the set

$A \cup B = \{x : x \in A \text{ or } x \in B\}$,

whose elements belong to at least one of the two sets, and possibly also to both.
The “difference” of the two sets A and B is the set

$A \setminus B = \{x : x \in A \text{ and } x \notin B\}$,

whose elements belong to the first set but not the second. In particular, the set $E \setminus A$
is said to be “complementary” to A and is denoted by $\mathcal{C}A$. Hence,

$\mathcal{C}A = \{x : x \notin A\}$.

The following De Morgan rules hold true:

$\mathcal{C}(A \cap B) = \mathcal{C}A \cup \mathcal{C}B$, $\mathcal{C}(A \cup B) = \mathcal{C}A \cap \mathcal{C}B$.

The “product” of the two sets A and B is the set

$A \times B = \{(a, b) : a \in A,\ b \in B\}$,

whose elements are the “ordered couples” (a, b), where at the first position we have
an element of A and at the second position an element of B.

³ Here the sets are defined by specifying the properties that their elements must possess.
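
These operations can be tried out on small finite sets. In the following sketch, the universal set E = {0, ..., 9} and the sets A and B are arbitrary choices of ours, and `complement` is a hypothetical helper name; Python's built-in set operators and `itertools.product` play the roles of the operations defined above.

```python
from itertools import product

E = set(range(10))   # a small "universal set", chosen only for illustration
A = {1, 2, 3, 4}
B = {3, 4, 5}


def complement(S, universe=E):
    """The complementary set of S, that is, universe minus S."""
    return universe - S


intersection = A & B   # elements belonging to both sets
union = A | B          # elements belonging to at least one set
difference = A - B     # elements of A not belonging to B

# De Morgan rules for sets
de_morgan_1 = complement(A & B) == complement(A) | complement(B)
de_morgan_2 = complement(A | B) == complement(A) & complement(B)

# Product: the set of ordered couples (a, b) with a in A and b in B
cartesian = set(product(A, B))

print(intersection, union, difference)
print(de_morgan_1, de_morgan_2)
print(len(cartesian))
```

The product has as many couples as the number of elements of A times the number of elements of B, and the couple (1, 3) belongs to it while (3, 1) does not.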


The Concept of Function 


A “function” (sometimes also called “application”) is defined by assigning three
sets:

• a set A, the “domain” of the function;
• a set B, the “codomain” of the function;
• a set $G \subseteq A \times B$, the “graph” of the function, having the following property: for
  every $a \in A$ there is a unique $b \in B$ such that $(a, b) \in G$.

A function defined in such a way is usually written $f : A \to B$ (read “f from A to
B”). To each element a of the domain we have a well-determined associated element
b of the codomain: such a b will be denoted by f(a), and we will write $a \mapsto f(a)$.
We thus have that

$(a, b) \in G \Leftrightarrow b = f(a)$,

$G = \{(a, b) \in A \times B : b = f(a)\} = \{(a, f(a)) : a \in A\}$.

For example, the function $f : \mathbb{N} \to \mathbb{R}$, defined as $f(n) = n/(n + 1)$, associates
to every $n \in \{0, 1, 2, 3, \ldots\}$ the corresponding value $n/(n + 1)$, i.e.,

$n \mapsto \frac{n}{n+1}$.

We will thus have that

$0 \mapsto \frac{0}{1} = 0$, $1 \mapsto \frac{1}{2}$, $2 \mapsto \frac{2}{3}$, $3 \mapsto \frac{3}{4}$, $\ldots$

Note that the values of this function are all rational numbers, so we could have
defined using the same formula a function $f : \mathbb{N} \to \mathbb{Q}$. Such a function, however,
is not the same as the previous one since they do not have the same codomain.

A function whose domain is the set $\mathbb{N}$ of natural numbers is also called a
“sequence,” and a different notation is usually preferred: if $s : \mathbb{N} \to B$ is such
a sequence, instead of $s(n)$, it is customary to write $s_n$, and the sequence itself is
denoted by $(s_n)_n$.
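
The definition of a function as a triple (domain, codomain, graph) can be tested on finite sets. The sketch below is an illustration under assumptions of ours: the helper `is_function` is a hypothetical name, the domain is truncated to {0, 1, 2, 3} because a computer cannot list all of the natural numbers, and exact fractions are used for the values n/(n + 1).

```python
from fractions import Fraction


def is_function(domain, codomain, graph) -> bool:
    """Check that graph is a subset of domain x codomain and that every
    element of the domain appears as first component exactly once."""
    if not all(a in domain and b in codomain for (a, b) in graph):
        return False
    return all(sum(1 for (x, _) in graph if x == a) == 1 for a in domain)


def f(n: int) -> Fraction:
    """The example from the text: n is sent to n / (n + 1)."""
    return Fraction(n, n + 1)


domain = {0, 1, 2, 3}                       # finite truncation of N (assumption)
graph = {(n, f(n)) for n in domain}         # the couples (n, f(n))
codomain = {f(n) for n in domain} | {Fraction(5)}  # may be larger than the image

print(sorted(graph))
print(is_function(domain, codomain, graph))

# Adding a second couple with first component 0 destroys uniqueness.
bad_graph = graph | {(0, Fraction(5))}
print(is_function(domain, codomain, bad_graph))
```

The last check fails because the element 0 would then have two associated elements, which the definition of graph forbids.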


The function $f : \mathbb{R} \to \mathbb{R}$, defined by $f(x) = x^2$, associates to every $x \in \mathbb{R}$ its
square. Notice that

$f(-x) = f(x)$, for every $x \in \mathbb{R}$.

We will say that such a function is “even.” If, instead, as for the function $f : \mathbb{R} \to \mathbb{R}$
defined by $f(x) = x^3$, one has that

$f(-x) = -f(x)$, for every $x \in \mathbb{R}$,

we will say that such a function is “odd.” Clearly, a function could very well be
neither even nor odd.

Sometimes it could be useful to use the notation $f(\cdot)$ instead of just f. For
example, if $g : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is a given function associating to each $(x, y) \in \mathbb{R}^2$ a real
number $g(x, y)$, then by $g(\cdot, y) : \mathbb{R} \to \mathbb{R}$ we will denote the function $x \mapsto g(x, y)$
for any fixed $y \in \mathbb{R}$.

The “image” of the function $f : A \to B$ is the set

$f(A) = \{f(a) : a \in A\}$,

and, in general, for every set $U \subseteq A$ we can write

$f(U) = \{f(a) : a \in U\}$;

it is the image of the function $f|_U$, the restriction of f to the domain U.

The “composition” of two functions $f : A \to B$ and $g : B \to C$ is the function
$g \circ f : A \to C$ defined as

$(g \circ f)(a) = g(f(a))$.

It could also be defined only assuming that the image of f is a subset of the domain
of g.
A function $f : A \to B$ is said to be

• “injective” if $a_1 \neq a_2 \Rightarrow f(a_1) \neq f(a_2)$;
• “surjective” if $f(A) = B$;
• “bijective” if it is both injective and surjective.

If $f : A \to B$ is bijective, then for every $b \in B$ there is an $a \in A$ such that
$f(a) = b$ (f is surjective), and such an element a is unique (f is injective). One
can thus define a function from B to A that associates to every $b \in B$ the unique
element $a \in A$ such that $f(a) = b$. This is the so-called “inverse function” of
$f : A \to B$, and it is usually denoted by $f^{-1} : B \to A$. Thus,

$f(a) = b \Leftrightarrow a = f^{-1}(b)$.

The word “bijective” will thus be synonymous with “invertible.” Notice that, for
every $a \in A$ and $b \in B$,

$f^{-1}(f(a)) = a$, $f(f^{-1}(b)) = b$.

For any function f, whether invertible or not, given a set $V \subseteq B$, it is common
practice to write

$f^{-1}(V) = \{a \in A : f(a) \in V\}$.

This is the so-called “counterimage set” of V; it is composed of those elements a of
A whose associated element f(a) belongs to V.
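
For functions between finite sets, the notions of image, injectivity, surjectivity, inverse function, and counterimage can be computed explicitly. In the sketch below a function is represented by a Python dictionary whose keys form the domain; this representation, the helper names, and the two sample functions are assumptions of ours, made only for illustration.

```python
def image(f: dict, U) -> set:
    """f(U): the set of values f(a) for a in U."""
    return {f[a] for a in U}


def is_injective(f: dict) -> bool:
    """Different elements have different images."""
    return len(set(f.values())) == len(f)


def is_surjective(f: dict, codomain) -> bool:
    """The image of the whole domain is the codomain."""
    return image(f, f.keys()) == set(codomain)


def inverse(f: dict, codomain) -> dict:
    """The inverse function, defined only when f is bijective."""
    if not (is_injective(f) and is_surjective(f, codomain)):
        raise ValueError("the function is not bijective")
    return {b: a for a, b in f.items()}


def counterimage(f: dict, V) -> set:
    """The set of those a in the domain for which f(a) belongs to V."""
    return {a for a in f if f[a] in V}


B = {"x", "y", "z"}
f = {1: "x", 2: "y", 3: "z"}   # bijective from {1, 2, 3} to B
g = {1: "x", 2: "x", 3: "y"}   # neither injective nor surjective onto B

f_inv = inverse(f, B)
print(all(f_inv[f[a]] == a for a in f))   # the inverse undoes f on every a
print(all(f[f_inv[b]] == b for b in B))   # and f undoes the inverse on every b
print(counterimage(g, {"x"}))             # defined even though g is not invertible
```

The counterimage is computed for g as well, in agreement with the remark that this notation is used whether the function is invertible or not.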
To conclude this brief presentation, let us recall that, given two functions $f : A \to B$
and $g : A \to B$, if the codomain B has an addition operation, we can
define the function $f + g : A \to B$ as follows:

$(f + g)(a) = f(a) + g(a)$.

Similar definitions can be given for the difference, product, and quotient of two
functions:

$(f - g)(a) = f(a) - g(a)$,
$(fg)(a) = f(a)\, g(a)$,
$\left(\frac{f}{g}\right)(a) = \frac{f(a)}{g(a)}$.


Part I


The Basics of Mathematical Analysis 


1 Sets of Numbers and Metric Spaces


In this chapter, we introduce the main settings where all the theory will be 
developed. First, we discuss the sets of numbers $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$, then the space
$\mathbb{R}^N$, and, finally, abstract metric spaces.


1.1 The Natural Numbers and the Induction Principle 


In 1889 Giuseppe Peano, in his fundamental paper “Arithmetices principia: nova
methodo exposita”, provided an axiomatic description of the set of natural numbers
$\mathbb{N}$. We briefly state those axioms as follows:


(a) There exists an element, called “zero,” denoted by 0.
(b) Every element n has a “successor” n'.
(c) 0 is not the successor of any element.
(d) Different elements have different successors.
(e) Induction principle: If S is a subset of $\mathbb{N}$ such that
    (i) $0 \in S$,
    (ii) $n \in S \Rightarrow n' \in S$,
    then $S = \mathbb{N}$.

It is tacitly understood that condition (ii) must be verified for any $n \in \mathbb{N}$. We
may therefore read it in the following way:

(ii) If, for some n, we have that $n \in S$, then also $n' \in S$.

We then introduce the familiar symbols 0' = 1, 1' = 2, 2' = 3, and so on.
From these few axioms, making use of set theory, Peano showed how to recover
all the properties of the natural numbers. In particular, we can define addition and
multiplication such that

$n' = n + 1$.

Moreover, writing $m \le n$ whenever there exists a $p \in \mathbb{N}$ such that $m + p = n$,
we obtain an order relation. We will assume here that all the properties of addition,
multiplication, and the order relation defined on $\mathbb{N}$ are well known.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_1


1.1.1 Recursive Definitions 

The induction principle can be used to define a sequence of objects

$A_0, A_1, A_2, A_3, \ldots$

We proceed in the following way:

(j) We define $A_0$.
(jj) Assuming that $A_n$ has already been defined, for some n, we define $A_{n+1}$.

In this way, if we denote by S the set of those n for which $A_n$ is well defined, it is
easy to see that such a set S satisfies (i) and (ii) in the induction principle. Hence,
S coincides with $\mathbb{N}$, meaning that every $A_n$ is well defined.

For example, we can define the “powers” $a^n$ by setting

(j) $a^0 = 1$,
(jj) $a^{n+1} = a \cdot a^n$.

We then verify that¹

$a^1 = a \cdot a^0 = a \cdot 1 = a$,
$a^2 = a \cdot a^1 = a \cdot a$,
$a^3 = a \cdot a^2 = a \cdot a \cdot a$,
$a^4 = a \cdot a^3 = a \cdot a \cdot a \cdot a$.

Henceforth, we will assume that all elementary properties of powers are well known.

¹ If a = 0, it is sometimes a subtle matter to define $0^0$. However, in this book we will always
assume that $0^0 = 1$.


Let us now define the “factorial” n! by setting

(j) $0! = 1$,
(jj) $(n + 1)! = (n + 1) \cdot n!$.

We then see that

$1! = 1 \cdot 0! = 1 \cdot 1 = 1$,
$2! = 2 \cdot 1! = 2 \cdot 1$,
$3! = 3 \cdot 2! = 3 \cdot 2 \cdot 1$,
$4! = 4 \cdot 3! = 4 \cdot 3 \cdot 2 \cdot 1$.

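Finally, let us define the “summation” of $\alpha_0, \alpha_1, \dots, \alpha_n$ using the notation
$$\sum_{k=0}^{n} \alpha_k,$$
which reads “the sum of $\alpha_k$ when $k$ goes from 0 to $n$.” We set
$$\sum_{k=0}^{0} \alpha_k = \alpha_0,$$
and, assuming that $\sum_{k=0}^{n} \alpha_k$ has been defined for some $n$, we set
$$\sum_{k=0}^{n+1} \alpha_k = \sum_{k=0}^{n} \alpha_k + \alpha_{n+1}.$$

In the preceding notation, an index appears, denoted by $k$; it takes all the integer
values between 0 and $n$. Informally,
$$\sum_{k=0}^{n} \alpha_k = \alpha_0 + \alpha_1 + \alpha_2 + \dots + \alpha_n.$$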


The fact that the index is denoted by the letter k is unimportant; we can use any 
other letter or symbol to denote it, for instance 


$$\sum_{j=0}^{n} \alpha_j, \qquad \sum_{\ell=0}^{n} \alpha_\ell, \qquad \sum_{m=0}^{n} \alpha_m, \qquad \sum_{\star=0}^{n} \alpha_\star.$$




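Notice that the same sum could be written
$$\sum_{k=1}^{n+1} \alpha_{k-1}, \qquad \sum_{k=2}^{n+2} \alpha_{k-2}, \qquad \dots, \qquad \sum_{k=m}^{n+m} \alpha_{k-m},$$
or even
$$\sum_{k=0}^{n} \alpha_{n-k}.$$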
As you can see, the use of the summation symbol has many variants, and we will 
sometimes need them in what follows. 
1.1.2 Proofs by Induction 
The induction principle can also be used to prove a sequence of propositions 

$$P_0, P_1, P_2, P_3, \dots$$

We must proceed as follows: 


(j) We verify $P_0$.
(jj) Assuming the truth of $P_n$ for some $n$, we verify $P_{n+1}$.


In this way, denoting by $S$ the set of those $n$ for which $P_n$ is true, we see that $S$ satisfies
both (i) and (ii) in the induction principle. Hence, $S$ coincides with $\mathbb{N}$, so all
propositions $P_n$ are true.

We now provide some examples. 

Example 1 We want to prove the Bernoulli inequality
$$P_n : \quad (1+a)^n \ge 1 + na.$$
We first see that $P_0$ is true since surely $(1+a)^0 \ge 1 + 0 \cdot a$. Let us now assume
that $P_n$ is true for some $n$; under this assumption, we need to verify $P_{n+1}$. Indeed,
we have
$$(1+a)^{n+1} = (1+a)^n (1+a) \ge (1+na)(1+a) = 1 + (n+1)a + na^2 \ge 1 + (n+1)a,$$
hence, $P_{n+1}$ is also true. In conclusion, $P_n$ is true for every $n \in \mathbb{N}$.
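A numerical spot check can make the inequality concrete. The sketch below is only an illustration, not a substitute for the induction argument; the name `bernoulli_gap` and the sample values are assumptions made for this example, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def bernoulli_gap(a, n):
    """Return (1 + a)^n - (1 + n*a); the Bernoulli inequality says this is >= 0."""
    return (1 + a) ** n - (1 + n * a)


# A few sample values of a (natural numbers and rationals greater than -1).
samples = [Fraction(-1, 2), Fraction(0), Fraction(1, 3), Fraction(2), Fraction(7)]
all_hold = all(bernoulli_gap(a, n) >= 0 for a in samples for n in range(12))
```

For instance, with $a = 1$ and $n = 3$ the gap is $2^3 - (1 + 3) = 4$.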


Remark 1.1 In this section we are dealing with natural numbers; however, the 
Bernoulli inequality is true as well for any real number a > —1, and the proof 




is exactly the same. Similar remarks could be made for the other formulas in the 
following discussion. 


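Example 2 The following properties of summation can be proven by induction:
$$\sum_{k=0}^{n} (\alpha_k + \beta_k) = \sum_{k=0}^{n} \alpha_k + \sum_{k=0}^{n} \beta_k,$$
which informally reads
$$(\alpha_0 + \beta_0) + (\alpha_1 + \beta_1) + \dots + (\alpha_n + \beta_n) = (\alpha_0 + \alpha_1 + \dots + \alpha_n) + (\beta_0 + \beta_1 + \dots + \beta_n);$$
and
$$\sum_{k=0}^{n} (C\alpha_k) = C\Big(\sum_{k=0}^{n} \alpha_k\Big),$$
which informally reads
$$C\alpha_0 + C\alpha_1 + \dots + C\alpha_n = C(\alpha_0 + \alpha_1 + \dots + \alpha_n).$$

Let us prove, for instance, the first one. We first verify that it holds for $n = 0$, with
$$\sum_{k=0}^{0} (\alpha_k + \beta_k) = \alpha_0 + \beta_0 = \sum_{k=0}^{0} \alpha_k + \sum_{k=0}^{0} \beta_k.$$
Assuming now that the formula is true for some $n$, we have
\begin{align*}
\sum_{k=0}^{n+1} (\alpha_k + \beta_k) &= \sum_{k=0}^{n} (\alpha_k + \beta_k) + (\alpha_{n+1} + \beta_{n+1}) \\
&= \sum_{k=0}^{n} \alpha_k + \sum_{k=0}^{n} \beta_k + (\alpha_{n+1} + \beta_{n+1}) \\
&= \Big(\sum_{k=0}^{n} \alpha_k + \alpha_{n+1}\Big) + \Big(\sum_{k=0}^{n} \beta_k + \beta_{n+1}\Big) = \sum_{k=0}^{n+1} \alpha_k + \sum_{k=0}^{n+1} \beta_k,
\end{align*}
so that the formula also holds for $n + 1$. The proof is thus complete.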




Example 3 The following formula involves a “telescopic sum”:
$$\sum_{k=0}^{n} (a_{k+1} - a_k) = a_{n+1} - a_0,$$
which can be visualized as
$$(a_1 - a_0) + (a_2 - a_1) + (a_3 - a_2) + \dots + (a_n - a_{n-1}) + (a_{n+1} - a_n) = a_{n+1} - a_0.$$
It can also be proved by induction.


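Example 4 Let us prove the identity
$$P_n : \quad a^{n+1} - b^{n+1} = (a - b)\Big(\sum_{k=0}^{n} a^k b^{n-k}\Big).$$
We first verify $P_0$, i.e.,
$$a^{0+1} - b^{0+1} = (a - b)\, a^0 b^{0-0},$$
which is clearly true. Assume now $P_n$ to be true for some $n \in \mathbb{N}$; then
\begin{align*}
(a - b)\Big(\sum_{k=0}^{n+1} a^k b^{n+1-k}\Big) &= (a - b)\Big(\sum_{k=0}^{n} a^k b^{n+1-k} + a^{n+1} b^0\Big) \\
&= (a - b)\Big(\sum_{k=0}^{n} a^k b^{n-k}\Big) b + (a - b)\, a^{n+1} \\
&= (a^{n+1} - b^{n+1})\, b + (a - b)\, a^{n+1} = a^{n+2} - b^{n+2},
\end{align*}
so that $P_{n+1}$ is also true. Thus, we have proved that $P_n$ is true for every $n \in \mathbb{N}$.
As particular cases of the preceding formula we have
$$a^2 - b^2 = (a - b)(a + b),$$
$$a^3 - b^3 = (a - b)(a^2 + ab + b^2),$$
$$a^4 - b^4 = (a - b)(a^3 + a^2 b + ab^2 + b^3),$$
$$a^5 - b^5 = (a - b)(a^4 + a^3 b + a^2 b^2 + ab^3 + b^4), \quad \dots$$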



Notice also the formula
$$\sum_{k=0}^{n} a^k = \frac{a^{n+1} - 1}{a - 1},$$
which holds for any $a \ne 1$, obtained from the preceding formula taking $b = 1$.
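The identity of Example 4 and its special case $b = 1$ (the geometric sum $\sum_{k=0}^{n} a^k = \frac{a^{n+1} - 1}{a - 1}$, valid for $a \ne 1$) can be checked on concrete numbers. This is a small Python sketch; the function names are hypothetical and chosen only for this illustration.

```python
def power_difference_rhs(a, b, n):
    """Right-hand side of P_n: (a - b) * sum_{k=0}^{n} a^k * b^(n-k)."""
    return (a - b) * sum(a**k * b ** (n - k) for k in range(n + 1))


def geometric_sum(a, n):
    """Return 1 + a + a^2 + ... + a^n, computed term by term."""
    return sum(a**k for k in range(n + 1))


# The left-hand side a^(n+1) - b^(n+1) should agree with the right-hand side.
identity_holds = all(
    power_difference_rhs(a, b, n) == a ** (n + 1) - b ** (n + 1)
    for a in range(-3, 4)
    for b in range(-3, 4)
    for n in range(6)
)
```

With integer inputs all the arithmetic is exact, so equality can be tested directly.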

In some cases it could be useful to start the sequence of propositions, e.g., from
$P_1$ instead of $P_0$ or from any other of them, say, $P_{\bar n}$. However, one can always reduce
to the previous case by a shift of the indices, so that the principle of induction indeed
remains of the same nature. Briefly, to prove the propositions
$$P_{\bar n}, P_{\bar n + 1}, P_{\bar n + 2}, P_{\bar n + 3}, \dots,$$
we verify the first proposition $P_{\bar n}$ and then, assuming that $P_n$ is true for some $n \ge \bar n$,
we verify $P_{n+1}$.
As an exercise, the reader could try to prove by induction the following identities:
$$1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n+1)(2n+1)}{6},$$
$$1^3 + 2^3 + 3^3 + \dots + n^3 = \frac{n^2 (n+1)^2}{4}.$$
Notice the beautiful equality
$$1^3 + 2^3 + 3^3 + \dots + n^3 = (1 + 2 + 3 + \dots + n)^2.$$
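Before attempting the induction proofs, one may want to test the identities for small $n$. The following Python sketch does this by brute force; `sum_of_powers` is a helper name introduced here for illustration.

```python
def sum_of_powers(n, p):
    """Return 1^p + 2^p + ... + n^p (the empty sum 0 when n = 0)."""
    return sum(k**p for k in range(1, n + 1))


squares_ok = all(
    6 * sum_of_powers(n, 2) == n * (n + 1) * (2 * n + 1) for n in range(60)
)
cubes_ok = all(
    4 * sum_of_powers(n, 3) == n**2 * (n + 1) ** 2 for n in range(60)
)
# The sum of the first n cubes equals the square of the sum of the first n integers.
square_of_sum_ok = all(
    sum_of_powers(n, 3) == sum_of_powers(n, 1) ** 2 for n in range(60)
)
```

Such a check proves nothing for all $n$, but it quickly exposes a misremembered formula.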


1.1.3 The Binomial Formula


Let us define, for any couple of natural numbers $n$, $k$ such that $k \le n$, the binomial
coefficients
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$

The following identity holds, for $1 \le k \le n$:
\begin{align*}
\binom{n}{k-1} + \binom{n}{k} &= \frac{n!}{(k-1)!\,(n-k+1)!} + \frac{n!}{k!\,(n-k)!} \\
&= \frac{n!\,k + n!\,(n-k+1)}{k!\,(n-k+1)!} \\
&= \frac{n!\,(n+1)}{k!\,(n-k+1)!} = \frac{(n+1)!}{k!\,((n+1)-k)!} = \binom{n+1}{k}.
\end{align*}


It is sometimes useful to represent the binomial coefficients in the so-called 
Pascal triangle 
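The triangle can be generated row by row from the identity $\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$: each inner entry is the sum of the two entries above it. A minimal Python sketch follows (the name `pascal_rows` is an illustrative choice).

```python
def pascal_rows(n_max):
    """Return rows 0..n_max of the Pascal triangle; row n lists C(n, 0), ..., C(n, n)."""
    rows = [[1]]
    for n in range(n_max):
        prev = rows[-1]  # row n, with entries prev[0], ..., prev[n]
        # Inner entries of row n+1: C(n+1, k) = C(n, k-1) + C(n, k) for k = 1..n.
        inner = [prev[k - 1] + prev[k] for k in range(1, n + 1)]
        rows.append([1] + inner + [1])
    return rows


triangle = pascal_rows(5)
```

Printing `triangle` line by line shows the rows `[1]`, `[1, 1]`, `[1, 2, 1]`, `[1, 3, 3, 1]`, and so on.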




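We will now prove, for every $n \in \mathbb{N}$, the binomial formula
$$P_n : \quad (a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.$$
It will be necessary to prove separately the case $n = 0$ and then start the induction
from $n = 1$.
If $n = 0$, then
$$(a+b)^0 = \binom{0}{0} a^{0-0} b^0,$$
and the formula holds. Assuming now $n \ge 1$, we proceed by induction. We first see
that it holds when $n = 1$:
$$(a+b)^1 = \binom{1}{0} a^{1-0} b^0 + \binom{1}{1} a^{1-1} b^1.$$
Now, assuming that $P_n$ is true for some $n \ge 1$, we prove that $P_{n+1}$ is also true:
\begin{align*}
(a+b)^{n+1} &= (a+b)(a+b)^n \\
&= (a+b)\Big(\sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k\Big) \\
&= \sum_{k=0}^{n} \binom{n}{k} a^{n-k+1} b^k + \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^{k+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{n-k+1} b^k + \sum_{k=0}^{n-1} \binom{n}{k} a^{n-k} b^{k+1} + b^{n+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{n-k+1} b^k + \sum_{k=1}^{n} \binom{n}{k-1} a^{n-(k-1)} b^{(k-1)+1} + b^{n+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \Big[\binom{n}{k} + \binom{n}{k-1}\Big] a^{n-k+1} b^k + b^{n+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \binom{n+1}{k} a^{n+1-k} b^k + b^{n+1} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} a^{n+1-k} b^k.
\end{align*}
We have thus proved by induction that $P_n$ is true for every $n \in \mathbb{N}$.
As particular cases of the binomial formula we have
$$(a+b)^2 = a^2 + 2ab + b^2,$$
$$(a+b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3,$$
$$(a+b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4,$$
$$(a+b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5, \quad \dots$$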
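The binomial formula can be evaluated directly and compared with $(a+b)^n$. The sketch below uses `math.comb` from the Python standard library for the binomial coefficients; the function name `binomial_expansion` is an assumption made for this example.

```python
from math import comb


def binomial_expansion(a, b, n):
    """Return sum_{k=0}^{n} C(n, k) * a^(n-k) * b^k."""
    return sum(comb(n, k) * a ** (n - k) * b**k for k in range(n + 1))


# Coefficients appearing in the expansion of (a + b)^5.
coefficients_5 = [comb(5, k) for k in range(6)]

formula_holds = all(
    binomial_expansion(a, b, n) == (a + b) ** n
    for a in range(-3, 4)
    for b in range(-3, 4)
    for n in range(7)
)
```

The case $n = 0$ relies on the convention $0^0 = 1$, which Python's integer power also follows.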


1.2 The Real Numbers

Starting from the set of natural numbers
$$\mathbb{N} = \{0, 1, 2, 3, \dots\},$$
by the use of set theory arguments it is possible first to construct the set of integer
numbers
$$\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$$
and then the set of rational numbers
$$\mathbb{Q} = \Big\{\frac{m}{n} : m \in \mathbb{Z},\ n \in \mathbb{N},\ n \ne 0\Big\}.$$


This set has a lot of nice features from an algebraic point of view. Let us briefly 
review them. 


1. An “order relation” $\le$ is defined, with the following properties.
For every choice of $x$, $y$, $z$
(a) $x \le x$.
(b) $[x \le y \text{ and } y \le x] \Rightarrow x = y$.
(c) $[x \le y \text{ and } y \le z] \Rightarrow x \le z$.
Moreover, such an order relation is “total” since any two elements $x$ and $y$
are comparable:
(d) $x \le y$ or $y \le x$.
If $x \le y$, we will also write $y \ge x$. If $x \le y$ and $y \ne x$, we will write $x < y$, or
$y > x$.
2. An addition operation $+$ is defined, with the following properties.
For any choice of $x$, $y$, $z$
(a) (Associative) $x + (y + z) = (x + y) + z$.
(b) There exists an “identity element” $0$: we have $x + 0 = x = 0 + x$.
(c) $x$ has an “inverse element” $-x$: we have $x + (-x) = 0 = (-x) + x$.
(d) (Commutative) $x + y = y + x$.
(e) If $x \le y$, then $x + z \le y + z$.
3. A multiplication operation $\cdot$ is defined, with the following properties.
For any choice of $x$, $y$, $z$
(a) (Associative) $x \cdot (y \cdot z) = (x \cdot y) \cdot z$.
(b) There exists an “identity element” $1$: we have $x \cdot 1 = x = 1 \cdot x$.
(c) If $x \ne 0$, then $x$ has an “inverse element” $x^{-1}$: we have $x \cdot x^{-1} = 1 = x^{-1} \cdot x$.
(d) (Commutative) $x \cdot y = y \cdot x$.
(e) If $x \le y$ and $z \ge 0$, then $x \cdot z \le y \cdot z$,
and a property involving both operations:
(f) (Distributive) $x \cdot (y + z) = (x \cdot y) + (x \cdot z)$.


A set satisfying the foregoing properties is called an “ordered field.” The set $\mathbb{Q}$
is, in some sense, the smallest ordered field.

We will often omit the symbol $\cdot$ in multiplication. Moreover, we adopt the usual
notations, writing $z = y - x$ if $z + x = y$ and $z = \frac{y}{x}$ if $zx = y$, with $x \ne 0$. In
particular, $x^{-1} = \frac{1}{x}$.

We rediscover the set $\mathbb{N}$ as a subset of $\mathbb{Q}$. Indeed, 0 and 1 are the identity elements
of addition and multiplication, respectively, and then we have $2 = 1 + 1$, $3 = 2 + 1$,
and so on.

Despite its nice algebraic properties, the set of rational numbers $\mathbb{Q}$ is not rich
enough to deal with such an elementary geometric problem as the measuring of the
diagonal of a square whose side has length 1, as the following theorem states.


Theorem 1.2 There is no rational number $x$ such that $x^2 = 2$.


Proof By contradiction, assume that there exist $m, n \in \mathbb{N}$ different from 0 such that
$$\Big(\frac{m}{n}\Big)^2 = 2,$$
i.e., $m^2 = 2n^2$. Then $m$ needs to be even, and so there exists a nonzero $m_1 \in \mathbb{N}$ such
that $m = 2m_1$. We thus have $4m_1^2 = 2n^2$, i.e., $2m_1^2 = n^2$. But then $n$ also needs to
be even, and so there exists a nonzero $n_1 \in \mathbb{N}$ such that $2n_1 = n$. Hence,
$$\frac{m}{n} = \frac{m_1}{n_1} \quad \text{and} \quad \Big(\frac{m_1}{n_1}\Big)^2 = 2.$$
We can now repeat the argument as many times as we want, continuing the division
by 2 of numerator and denominator:
$$\frac{m}{n} = \frac{m_1}{n_1} = \frac{m_2}{n_2} = \frac{m_3}{n_3} = \dots = \frac{m_k}{n_k} = \dots,$$
where $m_k$ and $n_k$ are nonzero natural numbers such that $m = 2^k m_k$, $n = 2^k n_k$.
Then, since $n_k \ge 1$, we have that $n \ge 2^k$ for any natural number $k \ge 1$. In particular,
$n \ge 2^n$. But the Bernoulli inequality tells us that $2^n = (1 + 1)^n \ge 1 + n$, and all this
implies that $n \ge 1 + n$, which is clearly false. □


Therefore, one feels the need to further extend the set Q so as to be able to deal 
with this kind of problem. It is indeed possible to construct a set R, containing 
Q, which is an ordered field and, hence, satisfies properties (1), (2), and (3) and 
moreover satisfies the following property. 


4. Separation Property. Given two nonempty subsets $A$, $B$ of $\mathbb{R}$ such that
$$\forall a \in A \quad \forall b \in B \quad a \le b,$$
there exists an element $c \in \mathbb{R}$ such that
$$\forall a \in A \quad \forall b \in B \quad a \le c \le b.$$


Mathematical Analysis is based on the set $\mathbb{R}$. We will assume that the reader is
familiar with its elementary algebraic properties.


1.2.1 Supremum and Infimum 


In this section we analyze some fundamental tools in R. Let us start with some 
definitions. 

A subset $E$ of $\mathbb{R}$ is said to be “bounded from above” if there exists $\alpha \in \mathbb{R}$ such
that, for every $x \in E$, we have $x \le \alpha$; such a number $\alpha$ is then an “upper bound”
of $E$. If, moreover, $\alpha \in E$, then we will say that $\alpha$ is the “maximum” of $E$, and we
will write $\alpha = \max E$.

Analogously, the set $E$ is said to be “bounded from below” if there exists $\beta \in \mathbb{R}$
such that, for every $x \in E$, we have $x \ge \beta$; such a number $\beta$ is then a “lower bound”
of $E$. If, moreover, we have that $\beta \in E$, then we will say that $\beta$ is the “minimum”
of $E$, and we will write $\beta = \min E$.

The set $E$ is said to be “bounded” if it is both bounded from above and below.

Some remarks are in order. The maximum, when it exists, is unique. However,
a set could be bounded from above without having a maximum, as the example
$E = \{x \in \mathbb{R} : x < 0\}$ shows. Similar considerations can be made for the minimum.




Theorem 1.3 If $E$ is nonempty and bounded from above, then the set of all upper
bounds of $E$ has a minimum.


Proof Let $B$ be the set of all upper bounds of $E$. Then
$$\forall a \in E \quad \forall b \in B \quad a \le b,$$
and by the separation property there exists a $c \in \mathbb{R}$ such that
$$\forall a \in E \quad \forall b \in B \quad a \le c \le b.$$
This means that $c$ is an upper bound of $E$, and hence $c \in B$, and it is also a lower
bound of $B$. Hence, $c = \min B$. □


If $E$ is nonempty and bounded from above, the smallest upper bound of $E$ is
called the “supremum” of $E$: it is a real number $s \in \mathbb{R}$ that will be denoted by
$\sup E$. It is characterized by the following two properties:


(i) $\forall x \in E \quad x \le s$.
(ii) $\forall s' < s \quad \exists x \in E : \ x > s'$.


The two preceding properties can also be equivalently written as follows:


(i) $\forall x \in E \quad x \le s$.
(ii) $\forall \varepsilon > 0 \quad \exists x \in E : \ x > s - \varepsilon$.


In the second expression, we understand that the number $\varepsilon > 0$ can be arbitrarily
small.

If the supremum $\sup E$ belongs to $E$, then $\sup E = \max E$; as we saw earlier,
however, this is not always the case.

We can state the following analogue of the preceding theorem. 


Theorem 1.4 If $E$ is nonempty and bounded from below, then the set of all lower
bounds of $E$ has a maximum.


If $E$ is nonempty and bounded from below, the greatest lower bound of $E$ is
called the “infimum” of $E$: it is a real number $\iota \in \mathbb{R}$ that will be denoted by $\inf E$.
It is characterized by the following two properties:


(i) $\forall x \in E \quad x \ge \iota$.
(ii) $\forall \iota' > \iota \quad \exists x \in E : \ x < \iota'$.


The two foregoing properties can also be equivalently written as follows:


(i) $\forall x \in E \quad x \ge \iota$.
(ii) $\forall \varepsilon > 0 \quad \exists x \in E : \ x < \iota + \varepsilon$.


If the infimum $\inf E$ belongs to $E$, then $\inf E = \min E$; however, the minimum
might not exist.
Notice that, defining the set
$$E^- = \{x \in \mathbb{R} : -x \in E\},$$
we have
$$E \text{ is bounded from above} \iff E^- \text{ is bounded from below},$$
and in that case,
$$\sup E = -\inf E^-,$$
while
$$E \text{ is bounded from below} \iff E^- \text{ is bounded from above},$$
and in that case,
$$\inf E = -\sup E^-.$$
In the case where $E$ is not bounded from above, we will write
$$\sup E = +\infty.$$
Theorem 1.5 The set $\mathbb{N}$ is not bounded from above, i.e., $\sup \mathbb{N} = +\infty$.

Proof Assume by contradiction that $\mathbb{N}$ is bounded from above. Then $s = \sup \mathbb{N}$ is
a real number. By the properties of the supremum, there exists an $n \in \mathbb{N}$ such that
$n > s - \frac{1}{2}$. But then $n + 1 \in \mathbb{N}$ and
$$n + 1 > s - \frac{1}{2} + 1 > s,$$
thereby contradicting the fact that $s$ is an upper bound for $\mathbb{N}$. □




In the case where $E$ is not bounded from below, we will write
$$\inf E = -\infty.$$
For instance, we have $\inf \mathbb{Z} = -\infty$.


1.2.2 The Square Root 

The following property will be used several times. 

Lemma 1.6 If $0 \le \alpha < \beta$, then $\alpha^2 < \beta^2$.

Proof If $0 \le \alpha < \beta$, then $\alpha^2 = \alpha\alpha \le \alpha\beta < \beta\beta = \beta^2$. □


We will now prove that there exists a real number $c > 0$ such that $c^2 = 2$. Let us
define the sets
$$A = \{x \in \mathbb{R} : x \ge 0 \text{ and } x^2 < 2\}, \qquad B = \{x \in \mathbb{R} : x \ge 0 \text{ and } x^2 > 2\}.$$
Let us check that
$$\forall a \in A \quad \forall b \in B \quad a \le b.$$
Indeed, if not, it would be $0 \le b < a$, and, hence, by Lemma 1.6, $b^2 < a^2$. But we
know that $a^2 < 2$ and $b^2 > 2$, hence $a^2 < b^2$, so we find a contradiction. By the
separation property, there is an element $c \in \mathbb{R}$ such that
$$\forall a \in A \quad \forall b \in B \quad a \le c \le b.$$
Notice that, since $1 \in A$, it is surely the case that $c \ge 1$. We will now prove, by
contradiction, that $c^2 = 2$.
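If $c^2 > 2$, then, for $n \ge 1$,
$$\Big(c - \frac{1}{n}\Big)^2 = c^2 - \frac{2c}{n} + \frac{1}{n^2} \ge c^2 - \frac{2c}{n};$$
hence, if $n > 2c/(c^2 - 2)$, since $c \ge 1$ and $n \ge 1$, then
$$c - \frac{1}{n} \ge 0 \quad \text{and} \quad \Big(c - \frac{1}{n}\Big)^2 > 2,$$
so that $c - \frac{1}{n} \in B$. But then $c \le c - \frac{1}{n}$, which is clearly impossible.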




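If $c^2 < 2$, then, for $n \ge 1$,
$$\Big(c + \frac{1}{n}\Big)^2 = c^2 + \frac{2c}{n} + \frac{1}{n^2} \le c^2 + \frac{2c}{n} + \frac{1}{n} = c^2 + \frac{2c + 1}{n};$$
hence, if $n > (2c + 1)/(2 - c^2)$, then $\big(c + \frac{1}{n}\big)^2 < 2$, and therefore $c + \frac{1}{n} \in A$. But
then $c + \frac{1}{n} \le c$, which is impossible.

Since both assumptions $c^2 > 2$ and $c^2 < 2$ lead to a contradiction, it must be that
$$c^2 = 2.$$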
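The sets $A$ and $B$ used in this argument can also be explored numerically: any rational in $A$ lies below $c$ and any rational in $B$ lies above it. The sketch below brackets $c$ by repeated halving. It is an illustration under stated assumptions, not the proof: the name `bracket_sqrt2` and the starting points 1 and 2 are choices made for this example.

```python
from fractions import Fraction


def bracket_sqrt2(steps):
    """Return rationals (lo, hi) with lo^2 < 2 < hi^2 after `steps` halvings."""
    lo, hi = Fraction(1), Fraction(2)  # 1 belongs to A, 2 belongs to B
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid  # mid belongs to A
        else:
            hi = mid  # mid belongs to B (mid^2 = 2 cannot occur for a rational mid)
    return lo, hi


lo, hi = bracket_sqrt2(20)
```

After 20 steps the two rationals differ by $2^{-20}$, so both approximate $c = \sqrt{2}$ to about six decimal places.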

Lemma 1.6 also tells us that there cannot exist any other positive solutions of the
equation
$$x^2 = 2,$$
which therefore has exactly two solutions, $x = c$ and $x = -c$.

The same type of reasoning can be used to prove that, for any positive real
number $r$, there exists a unique positive real number $c$ such that $c^2 = r$. This
number $c$ is called the square root of $r$, and we write $c = \sqrt{r}$. Notice that the
equation $x^2 = r$ has indeed two solutions, $x = \sqrt{r}$ and $x = -\sqrt{r}$. One also writes
$\sqrt{0} = 0$, whereas the square root of a negative number remains undefined. This
subject will be reconsidered in the framework of the complex numbers.
At this point we are ready to deal with the quadratic equation
$$ax^2 + bx + c = 0,$$
where $a$, $b$, and $c$ are real numbers, with $a \ne 0$. It can be written equivalently as
follows:
$$\Big(x + \frac{b}{2a}\Big)^2 = \frac{b^2 - 4ac}{(2a)^2}.$$
Thus, we see that the equation is solvable if and only if $b^2 - 4ac \ge 0$, in which case
the solutions are
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
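The solvability criterion and the solution formula can be turned into a short routine. This is a minimal floating-point sketch; the function name `solve_quadratic` and the choice to return a sorted list of distinct real solutions are assumptions of this example.

```python
import math


def solve_quadratic(a, b, c):
    """Return the real solutions of a*x^2 + b*x + c = 0 (with a != 0), sorted."""
    if a == 0:
        raise ValueError("the coefficient a must be nonzero")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []  # not solvable in the real numbers
    root = math.sqrt(discriminant)
    # A set merges the two solutions when the discriminant is zero.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})
```

For example, `solve_quadratic(1, -3, 2)` corresponds to $x^2 - 3x + 2 = 0$, whose solutions are 1 and 2. Floating-point rounding means the results are approximations in general.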


Let us now define the “absolute value” (or “modulus”) of a real number $x$ as
$$|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$$




The following properties may be easily verified. For every $x_1$, $x_2$ in $\mathbb{R}$,
$$|x_1 x_2| = |x_1|\,|x_2|,$$
whereas
$$|x_1 + x_2| \le |x_1| + |x_2|.$$
We will sometimes also need the inequality
$$\big|\,|x_1| - |x_2|\,\big| \le |x_1 - x_2|$$


and the equivalence 


1.2.3 Intervals 
Let us explain what we mean by “interval.” 


Definition 1.7 An interval is a nonempty subset $I$ of $\mathbb{R}$ having the following
property: if $\alpha$, $\beta$ are two of its elements, then $I$ contains all the numbers between
them.


We will not exclude the case where $I$ only has a single element.


Proposition 1.8 Let $I$ be an interval, define $a = \inf I$, $b = \sup I$ (possibly $a = -\infty$
or $b = +\infty$), and assume $a \ne b$. If $a < x < b$, then $x \in I$.


Proof If $a < x < b$, then, by the properties of the infimum and supremum, we can
always find $\alpha$ and $\beta$ in $I$ such that $a \le \alpha < x < \beta \le b$. Thus, by the preceding
definition, $I$ contains $x$. □


By the foregoing proposition, distinguishing the cases where $a$ and $b$ can be real
numbers or not and whether or not they belong to $I$, we can conclude that any
interval $I$ must be among those in the following list:
$$[a, b] = \{x : a \le x \le b\},$$
$$]a, b[ \; = \{x : a < x < b\},$$
$$[a, b[ \; = \{x : a \le x < b\},$$
$$]a, b] = \{x : a < x \le b\},$$
$$[a, +\infty[ \; = \{x : x \ge a\},$$
$$]a, +\infty[ \; = \{x : x > a\},$$
$$]-\infty, b] = \{x : x \le b\},$$
$$]-\infty, b[ \; = \{x : x < b\},$$
$$\mathbb{R}, \text{ sometimes denoted by } ]-\infty, +\infty[.$$

Note that, when $a = b$, the interval $[a, a]$ reduces to a single point. In that case we
say that the interval is “degenerate.”


Theorem 1.9 (Cantor Theorem) Let $(I_n)_n$ be a sequence of intervals of the type
$I_n = [a_n, b_n]$, with $a_n \le b_n$, such that
$$I_0 \supseteq I_1 \supseteq I_2 \supseteq I_3 \supseteq \dots$$
Then there is a $c \in \mathbb{R}$ that belongs to all the intervals $I_n$.


Proof Let us define the two sets
$$A = \{a_n : n \in \mathbb{N}\},$$
$$B = \{b_n : n \in \mathbb{N}\}.$$
For any $a_n$ in $A$ and any $b_m$ in $B$ (not necessarily having the same index), we have
that $a_n \le b_m$. Indeed, if $n \le m$, then $I_n \supseteq I_m$, hence $a_n \le a_m \le b_m \le b_n$. On the
other hand, if $n > m$, then $I_m \supseteq I_n$, so that $a_m \le a_n \le b_n \le b_m$. We have thus
proved that
$$\forall a \in A \quad \forall b \in B \quad a \le b.$$
By the separation property, there is a $c \in \mathbb{R}$ such that
$$\forall a \in A \quad \forall b \in B \quad a \le c \le b.$$
In particular, $a_n \le c \le b_n$, which means that $c \in I_n$ for every $n \in \mathbb{N}$. □


1.2.4 Properties of Q and R \ Q 
We will now study the “density” of Q and R \ Q in the set of real numbers R. 


Theorem 1.10 Given two real numbers $\alpha$, $\beta$, with $\alpha < \beta$, there always exists a
rational number in between them.


Proof Let us consider three different cases.
First Case: $0 \le \alpha < \beta$. Choose $n \in \mathbb{N}$ such that
$$n > \frac{1}{\beta - \alpha},$$
and let $m \in \mathbb{N}$ be the greatest natural number such that
$$m < n\beta.$$
Then clearly $\frac{m}{n} < \beta$, and we will now show that it must be that $\frac{m}{n} > \alpha$. By
contradiction, assume that $\frac{m}{n} \le \alpha$; then
$$\frac{m + 1}{n} \le \alpha + \frac{1}{n} < \alpha + (\beta - \alpha) = \beta,$$
meaning $m + 1 < n\beta$, in contradiction to the fact that $m$ is the greatest natural
number less than $n\beta$.

Second Case: $\alpha < 0 < \beta$. It is sufficient to choose 0, which is a rational number.
Third Case: $\alpha < \beta \le 0$. We can reduce this case to the first case, changing signs:
since $0 \le -\beta < -\alpha$, there exists a rational number $\frac{m}{n}$ such that $-\beta < \frac{m}{n} < -\alpha$.
Hence, $\alpha < -\frac{m}{n} < \beta$. □
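The proof is constructive, so it can be followed step by step in code. In the sketch below the real numbers $\alpha$, $\beta$ are represented by exact `Fraction` values purely so that the arithmetic is exact; that representation and the name `rational_between` are assumptions of this illustration.

```python
from fractions import Fraction
from math import ceil, floor


def rational_between(alpha, beta):
    """Return a rational q with alpha < q < beta, following the three cases of the proof."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    if not alpha < beta:
        raise ValueError("alpha must be smaller than beta")
    if alpha < 0 < beta:  # second case: 0 itself works
        return Fraction(0)
    if beta <= 0:  # third case: change signs and reuse the first case
        return -rational_between(-beta, -alpha)
    # First case: 0 <= alpha < beta.
    n = floor(1 / (beta - alpha)) + 1  # a natural number n > 1 / (beta - alpha)
    m = ceil(n * beta) - 1  # the greatest natural number m with m < n * beta
    return Fraction(m, n)
```

For $\alpha = \frac13$, $\beta = \frac12$ this gives $n = 7$ and $m = 3$, hence the rational $\frac37$.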


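Theorem 1.11 Given two real numbers $\alpha$, $\beta$, with $\alpha < \beta$, there always exists an
irrational number in between them.


Proof By the previous theorem, there exists a rational number $\frac{m}{n}$ such that
$$\alpha + \sqrt{2} < \frac{m}{n} < \beta + \sqrt{2}.$$
Consequently,
$$\alpha < \frac{m}{n} - \sqrt{2} < \beta,$$
with $\frac{m}{n} - \sqrt{2} \notin \mathbb{Q}$. □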


We will now discover a crucial difference between the sets Q and R \ Q. Let us 
consider the following sequence of nonnegative rational numbers: 


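$$\begin{array}{cccccccccccccccc}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 \\
\downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\
\frac{0}{1} & \frac{0}{2} & \frac{1}{1} & \frac{0}{3} & \frac{1}{2} & \frac{2}{1} & \frac{0}{4} & \frac{1}{3} & \frac{2}{2} & \frac{3}{1} & \frac{0}{5} & \frac{1}{4} & \frac{2}{3} & \frac{3}{2} & \frac{4}{1} & \frac{0}{6}
\end{array} \quad \dots$$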



As you can see, the sequence is built choosing rational numbers having the sum 
of their numerator and denominator equal to 1, then 2, then 3, and so on. In this 
way, all nonnegative rational numbers will sooner or later appear in the list. We now 
modify it in order to make it injective, checking from the beginning all the numbers 
in the list, one by one, and eliminating those that had already appeared previously: 


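$$\begin{array}{cccccccccccccccc}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 \\
\downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\
\frac{0}{1} & \frac{1}{1} & \frac{1}{2} & \frac{2}{1} & \frac{1}{3} & \frac{3}{1} & \frac{1}{4} & \frac{2}{3} & \frac{3}{2} & \frac{4}{1} & \frac{1}{5} & \frac{5}{1} & \frac{1}{6} & \frac{2}{5} & \frac{3}{4} & \frac{4}{3}
\end{array} \quad \dots$$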


We are now ready to introduce the negative numbers as well, so as to obtain all 
the rationals: 


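$$\begin{array}{cccccccccccccccc}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 \\
\downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\
\frac{0}{1} & \frac{1}{1} & -\frac{1}{1} & \frac{1}{2} & -\frac{1}{2} & \frac{2}{1} & -\frac{2}{1} & \frac{1}{3} & -\frac{1}{3} & \frac{3}{1} & -\frac{3}{1} & \frac{1}{4} & -\frac{1}{4} & \frac{2}{3} & -\frac{2}{3} & \frac{3}{2}
\end{array} \quad \dots$$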


In this way, we have constructed a bijective function $\varphi : \mathbb{N} \to \mathbb{Q}$. We will thus
say that $\mathbb{Q}$ is a “countable” set.
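The construction just described (list the fractions by increasing sum of numerator and denominator, drop repetitions, then insert each negative right after its positive) can be written as a generator. This is a sketch under the assumption that, within each group, fractions are ordered by increasing numerator; the name `rationals` is illustrative.

```python
from fractions import Fraction
from itertools import islice


def rationals():
    """Yield every rational number exactly once, in the order described above."""
    seen = set()
    total = 1  # current value of numerator + denominator
    while True:
        for numerator in range(total):
            q = Fraction(numerator, total - numerator)
            if q in seen:
                continue  # already listed, e.g. 2/2 = 1/1
            seen.add(q)
            yield q
            if q != 0:
                yield -q  # the negative follows its positive counterpart
        total += 1


first_16 = list(islice(rationals(), 16))
```

The position of a rational in this list is its image under the inverse of the bijection between $\mathbb{N}$ and $\mathbb{Q}$.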

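Let us now prove that $\mathbb{R}$ is not countable, i.e., there cannot exist any bijective
function $\varphi : \mathbb{N} \to \mathbb{R}$. If such a function existed, there would also exist a surjective
function from $\mathbb{N}$ onto $[0, 1]$. So assume by contradiction that there exists
a surjective function $\psi : \mathbb{N} \to [0, 1]$. Divide the interval $[0, 1]$ into three equal
parts:
$$\Big[0, \frac{1}{3}\Big], \qquad \Big[\frac{1}{3}, \frac{2}{3}\Big], \qquad \Big[\frac{2}{3}, 1\Big].$$
Choose one of them, and denote it by $I_0 = [a_0, b_0]$, with the property that $\psi(0) \notin I_0$. Now
iterate this procedure: divide $I_0$ into three equal parts and denote by $I_1 = [a_1, b_1]$
one of these with the property that $\psi(1) \notin I_1$. Then divide $I_1$ into three equal parts
and denote by $I_2 = [a_2, b_2]$ one of these with the property that $\psi(2) \notin I_2$, and so
on. In this way, we have constructed a sequence of intervals $I_n = [a_n, b_n]$ with the
property that $\psi(n) \notin I_n$ for every $n \in \mathbb{N}$ and
$$I_0 \supseteq I_1 \supseteq I_2 \supseteq I_3 \supseteq \dots$$
By Cantor’s Theorem 1.9, there is a $c \in \mathbb{R}$ that belongs to all the intervals $I_n$. Hence,
$c \in [0, 1]$ and $c \ne \psi(n)$ for every $n \in \mathbb{N}$, contradicting the surjectivity of the function $\psi$.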
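The trisection step can be tried on a finite list of values, which shows why three parts are used: a point can lie in at most two of the closed thirds, so one third always avoids it. The sketch below works with exact fractions; the name `nested_interval_avoiding` and the sample values are assumptions of this example, and a finite list of course only illustrates the first steps of the infinite construction.

```python
from fractions import Fraction


def nested_interval_avoiding(values):
    """Shrink [0, 1] by thirds so that the n-th interval does not contain values[n]."""
    a, b = Fraction(0), Fraction(1)
    for v in values:
        third = (b - a) / 3
        for lo in (a, a + third, a + 2 * third):
            hi = lo + third
            if not (lo <= v <= hi):  # this closed third avoids v
                a, b = lo, hi
                break
    return a, b


values = [Fraction(k, 10) for k in range(11)] + [Fraction(1, 3), Fraction(2, 3)]
a, b = nested_interval_avoiding(values)
```

Every point of the final interval $[a, b]$ differs from all the listed values, since the intervals are nested.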

We now claim that $\mathbb{R} \setminus \mathbb{Q}$ cannot be countable. Indeed, if it were countable, we
would have an injective sequence $(\alpha_n)_n$ such that $\{\alpha_n : n \in \mathbb{N}\} = \mathbb{R} \setminus \mathbb{Q}$. On the
other hand, since we know that $\mathbb{Q}$ is countable, there is an injective sequence $(\beta_n)_n$
such that $\{\beta_n : n \in \mathbb{N}\} = \mathbb{Q}$. Then the sequence defined as
$$\alpha_0, \beta_0, \alpha_1, \beta_1, \alpha_2, \beta_2, \alpha_3, \beta_3, \dots$$
would contain all real numbers, and we know that this is impossible. Our claim is
thus proved.
1.3 The Complex Numbers

Let us consider the set
$$\mathbb{R} \times \mathbb{R} = \{(a, b) : a \in \mathbb{R},\ b \in \mathbb{R}\},$$
which is often denoted by $\mathbb{R}^2$. We define an addition operation as
$$(a, b) + (a', b') = (a + a', b + b').$$


The following properties are readily verified. For any choice of $(a, b)$, $(a', b')$,
$(a'', b'')$,


(a) (Associative) $(a, b) + [(a', b') + (a'', b'')] = [(a, b) + (a', b')] + (a'', b'')$.
(b) There exists an “identity element” $(0, 0)$: We have
$$(a, b) + (0, 0) = (a, b) = (0, 0) + (a, b).$$
(c) $(a, b)$ has an “inverse element” $-(a, b) = (-a, -b)$: We have
$$(a, b) + (-a, -b) = (0, 0) = (-a, -b) + (a, b).$$
(d) (Commutative) $(a, b) + (a', b') = (a', b') + (a, b)$.

We also define a multiplication operation $\cdot$ as
$$(a, b) \cdot (a', b') = (aa' - bb', ab' + ba').$$

We can then verify the following properties. For any choice of $(a, b)$, $(a', b')$,
$(a'', b'')$,


(a) (Associative) $(a, b) \cdot [(a', b') \cdot (a'', b'')] = [(a, b) \cdot (a', b')] \cdot (a'', b'')$.
(b) There exists an “identity element” $(1, 0)$: We have
$$(a, b) \cdot (1, 0) = (a, b) = (1, 0) \cdot (a, b).$$




(c) If $(a, b) \ne (0, 0)$, then $(a, b)$ has an “inverse element”
$$(a, b)^{-1} = \Big(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\Big).$$
We have
$$(a, b) \cdot \Big(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\Big) = (1, 0) = \Big(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\Big) \cdot (a, b).$$
(d) (Commutative) $(a, b) \cdot (a', b') = (a', b') \cdot (a, b)$.
(e) (Distributive) $(a, b) \cdot [(a', b') + (a'', b'')] = [(a, b) \cdot (a', b')] + [(a, b) \cdot (a'', b'')]$.

(Henceforth, we will often omit the sign “$\cdot$”.) In this way, $(\mathbb{R}^2, +, \cdot)$ has the
algebraic structure of a field; it is indicated by $\mathbb{C}$ and referred to as the complex field.
Its elements will be called “complex numbers.”

We can view C as an extension of R identifying each element of the type (a, 0) 


with the corresponding real number a. The operations of addition and multiplication 
are indeed preserved: 


(a, 0) + (a’,0) = (a+a’,0), 
(a, 0) - (a’, 0) = (aa’,0). 


We now focus on the identity
$$(a, b) = (a, 0) + (0, 1)(b, 0).$$
It is worth introducing a new symbol for the element $(0, 1)$. We will write
$$(0, 1) = i.$$
In this way, having identified $(a, 0)$ with $a$ and $(b, 0)$ with $b$, we can write
$$(a, b) = a + ib.$$

For any complex number $z = a + ib$, the real numbers $a$ and $b$ are called the “real
part” and “imaginary part” of $z$, respectively, and they are denoted by
$$a = \Re(z), \qquad b = \Im(z).$$
Now we present a crucial identity:
$$i^2 = (0, 1)(0, 1) = (-1, 0) = -1.$$




Using this simple information, we can verify that all the usual symbolic rules are 
satisfied; for instance, 


$$(a + ib) + (a' + ib') = (a + a') + i(b + b'),$$
$$(a + ib)(a' + ib') = (aa' - bb') + i(ab' + ba').$$
We are therefore allowed to manipulate complex numbers using all the algebraic 
rules we know well. In the next section we provide an example. 
1.3.1 Algebraic Equations in C 


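Let $z = a + ib$ be a fixed complex number, with $a, b \in \mathbb{R}$. We want to solve the
equation
$$u^2 = z.$$
We will refer to the solutions $u \in \mathbb{C}$ as “complex square roots” of the number $z$ (or
“square roots” for short, being careful not to confuse them with the notion of square
root already introduced in $\mathbb{R}$). If $b = 0$, then we find
$$u = \begin{cases} \pm\sqrt{a} & \text{if } a \ge 0, \\ \pm i\sqrt{-a} & \text{if } a < 0. \end{cases}$$
Otherwise, if $b \ne 0$, then let us write $u = x + iy$. Then
$$x^2 - y^2 = a, \qquad 2xy = b.$$
Since $b \ne 0$, we have $x \ne 0$ and $y \ne 0$. We can then write $y = \frac{b}{2x}$ and obtain
$$x^4 - ax^2 - \frac{b^2}{4} = 0,$$
whence, since $x^2 > 0$,
$$x^2 = \frac{a + \sqrt{a^2 + b^2}}{2}.$$
Thus, we have found the two solutions
$$u = \pm\left[\sqrt{\frac{a + \sqrt{a^2 + b^2}}{2}} + i\,\frac{b}{\sqrt{2\big(a + \sqrt{a^2 + b^2}\big)}}\right].$$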
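The case distinction above gives an explicit recipe for the two complex square roots. The following floating-point sketch implements it with Python's built-in `complex` type; the function name `complex_square_roots` is an illustrative choice, and results are subject to rounding.

```python
import math


def complex_square_roots(z):
    """Return the two solutions u, -u of u^2 = z, following the formulas above."""
    a, b = z.real, z.imag
    if b == 0:
        # Real case: +-sqrt(a) if a >= 0, +-i*sqrt(-a) if a < 0.
        u = complex(math.sqrt(a), 0) if a >= 0 else complex(0, math.sqrt(-a))
    else:
        r = math.hypot(a, b)  # sqrt(a^2 + b^2); here a + r > 0 because b != 0
        x = math.sqrt((a + r) / 2)
        y = b / math.sqrt(2 * (a + r))
        u = complex(x, y)
    return u, -u


u, minus_u = complex_square_roots(3 + 4j)
```

For $z = 3 + 4i$ we have $\sqrt{a^2 + b^2} = 5$, so $x = \sqrt{4} = 2$ and $y = 4/\sqrt{16} = 1$, giving $u = \pm(2 + i)$.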




We now turn to the quadratic equation
$$au^2 + bu + c = 0,$$
where $a$, $b$, and $c$ are any fixed complex numbers, with $a \ne 0$. As we saw in the
real case, this equation is equivalent to
$$\Big(u + \frac{b}{2a}\Big)^2 = \frac{b^2 - 4ac}{(2a)^2};$$
hence, setting
$$v = u + \frac{b}{2a}, \qquad z = \frac{b^2 - 4ac}{(2a)^2},$$
we are led to $v^2 = z$, i.e., to the problem of finding the square roots of $z$, a problem
we already know how to solve.
To conclude, for a more general polynomial equation
$$a_n u^n + a_{n-1} u^{n-1} + \dots + a_1 u + a_0 = 0,$$
where $a_0, a_1, \dots, a_n$ are any fixed complex numbers, with $a_n \ne 0$, we have the
following theorem.


Theorem 1.12 (Fundamental Theorem of Algebra) Any polynomial equation
has, in the complex field, at least one solution.


The problem of finding a general procedure to determine the solutions of
the foregoing equation has troubled mathematicians for a very long time. We
encountered it in the case $n = 2$, and it was also settled if $n = 3$ or $4$. However, if
$n \ge 5$, then it has finally been proved that such a general procedure does not exist.


1.3.2 The Modulus of a Complex Number 


We now examine some additional properties of complex numbers. If $z = a + ib$,
with $a, b \in \mathbb{R}$, then we define the “modulus” of $z$,
$$|z| = \sqrt{a^2 + b^2}.$$
Notice that, if $z = a \in \mathbb{R}$, then we recover the absolute value
$$|a| = \sqrt{a^2} = \begin{cases} a & \text{if } a \ge 0, \\ -a & \text{if } a < 0. \end{cases}$$




Given two complex numbers $z_1$ and $z_2$, let us verify the identity
$$|z_1 z_2| = |z_1|\,|z_2|.$$
Indeed, if $z_1 = a_1 + ib_1$ and $z_2 = a_2 + ib_2$, then
\begin{align*}
|z_1 z_2|^2 &= (a_1 a_2 - b_1 b_2)^2 + (a_1 b_2 + b_1 a_2)^2 \\
&= a_1^2 a_2^2 - 2a_1 a_2 b_1 b_2 + b_1^2 b_2^2 + a_1^2 b_2^2 + 2a_1 b_2 b_1 a_2 + b_1^2 a_2^2 \\
&= a_1^2 a_2^2 + b_1^2 b_2^2 + a_1^2 b_2^2 + b_1^2 a_2^2 \\
&= (a_1^2 + b_1^2)(a_2^2 + b_2^2) = |z_1|^2 |z_2|^2.
\end{align*}
In particular, if the two numbers coincide, then
$$|z^2| = |z|^2,$$
and it can be proved by induction that, for every $n \in \mathbb{N}$,
$$|z^n| = |z|^n.$$
Moreover, if $z \ne 0$, since $|z^{-1} z| = 1$, then we have
$$|z^{-1}| = |z|^{-1}.$$
Hence, for any positive integer $n$,
$$|z^{-n}| = |(z^{-1})^n| = |z^{-1}|^n = (|z|^{-1})^n = |z|^{-n}.$$
Thus, we have seen that the equality $|z^n| = |z|^n$ holds for every $n \in \mathbb{Z}$.
For any complex number $z = a + ib$ let us define its “complex conjugate” $z^* =
a - ib$ (sometimes denoted by $\bar z$). The following properties hold:
$$(z_1 + z_2)^* = z_1^* + z_2^*,$$
$$(z_1 z_2)^* = z_1^* z_2^*,$$
$$z^{**} = z,$$
$$|z^*| = |z|,$$
$$z z^* = |z|^2,$$
$$\Re(z_1 + z_2) = \Re(z_1) + \Re(z_2), \qquad \Im(z_1 + z_2) = \Im(z_1) + \Im(z_2),$$
$$\Re(z) = \frac{1}{2}(z + z^*), \qquad \Im(z) = \frac{1}{2i}(z - z^*),$$
$$|\Re(z)| \le |z|, \qquad |\Im(z)| \le |z|,$$
and, if $z \ne 0$,
$$z^{-1} = \frac{z^*}{|z|^2}.$$


Let us now prove the following subadditivity property of the modulus:
$$|z_1 + z_2| \le |z_1| + |z_2|.$$
Indeed,
\begin{align*}
|z_1 + z_2|^2 &= (z_1 + z_2)(z_1 + z_2)^* \\
&= (z_1 + z_2)(z_1^* + z_2^*) \\
&= z_1 z_1^* + z_1 z_2^* + z_2 z_1^* + z_2 z_2^* \\
&= |z_1|^2 + z_1 z_2^* + (z_1 z_2^*)^* + |z_2|^2 \\
&= |z_1|^2 + 2\Re(z_1 z_2^*) + |z_2|^2 \\
&\le |z_1|^2 + 2|z_1 z_2^*| + |z_2|^2 \\
&= |z_1|^2 + 2|z_1|\,|z_2^*| + |z_2|^2 \\
&= |z_1|^2 + 2|z_1|\,|z_2| + |z_2|^2 = (|z_1| + |z_2|)^2,
\end{align*}
and Lemma 1.6 completes the proof.


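1.4 The Space $\mathbb{R}^N$


Let us introduce the set $\mathbb{R}^N$, composed of the $N$-tuples $(x_1, x_2, \dots, x_N)$, where $x_1$,
$x_2, \dots, x_N$ are real numbers. We will denote its elements by the symbols
$$\mathbf{x}, \mathbf{x}', \mathbf{x}'', \dots$$
Let us start by defining an addition operation in $\mathbb{R}^N$. Given two elements
$\mathbf{x} = (x_1, x_2, \dots, x_N)$ and $\mathbf{x}' = (x_1', x_2', \dots, x_N')$,
we set
$$\mathbf{x} + \mathbf{x}' = (x_1 + x_1', x_2 + x_2', \dots, x_N + x_N').$$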




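The following properties hold. For any choice of $\mathbf{x}$, $\mathbf{x}'$, $\mathbf{x}''$,


(a) (Associative) $(\mathbf{x} + \mathbf{x}') + \mathbf{x}'' = \mathbf{x} + (\mathbf{x}' + \mathbf{x}'')$.

(b) There exists an “identity element” $\mathbf{0} = (0, 0, \dots, 0)$.
We have $\mathbf{x} + \mathbf{0} = \mathbf{x} = \mathbf{0} + \mathbf{x}$.

(c) $\mathbf{x} = (x_1, x_2, \dots, x_N)$ has an “inverse element” $(-\mathbf{x}) = (-x_1, -x_2, \dots, -x_N)$.
We have $\mathbf{x} + (-\mathbf{x}) = \mathbf{0} = (-\mathbf{x}) + \mathbf{x}$.

(d) (Commutative) $\mathbf{x} + \mathbf{x}' = \mathbf{x}' + \mathbf{x}$.


Therefore, $(\mathbb{R}^N, +)$ is an “abelian group.” As usual, we write $\mathbf{x} - \mathbf{x}'$ to denote
$\mathbf{x} + (-\mathbf{x}')$.

We now define the product of an element of $\mathbb{R}^N$ by a real number. Given $\mathbf{x} =
(x_1, x_2, \dots, x_N) \in \mathbb{R}^N$ and $\alpha \in \mathbb{R}$, we set
$$\alpha\mathbf{x} = (\alpha x_1, \alpha x_2, \dots, \alpha x_N).$$
The following properties hold:


(a) $\alpha(\beta\mathbf{x}) = (\alpha\beta)\mathbf{x}$.

(b) $(\alpha + \beta)\mathbf{x} = (\alpha\mathbf{x}) + (\beta\mathbf{x})$.
(c) $\alpha(\mathbf{x} + \mathbf{x}') = (\alpha\mathbf{x}) + (\alpha\mathbf{x}')$.
(d) $1\mathbf{x} = \mathbf{x}$.


With the preceding operations, $\mathbb{R}^N$ is a “vector space,” and we will call its
elements “vectors.” In this environment, the real numbers will be called “scalars.”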

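It is useful to introduce here the “scalar product” of two vectors. Given $\mathbf{x}$
and $\mathbf{x}'$ as previously, we define the real number
$$\mathbf{x} \cdot \mathbf{x}' = \sum_{k=1}^{N} x_k x_k' = x_1 x_1' + x_2 x_2' + \dots + x_N x_N'.$$
The scalar product is also denoted by a variety of symbols, for example,
$$\langle \mathbf{x} | \mathbf{x}' \rangle, \quad \langle \mathbf{x}, \mathbf{x}' \rangle, \quad (\mathbf{x} | \mathbf{x}'), \quad (\mathbf{x}, \mathbf{x}').$$
The following properties hold:


(a) $\mathbf{x} \cdot \mathbf{x} \ge 0$.

(b) $\mathbf{x} \cdot \mathbf{x} = 0 \iff \mathbf{x} = \mathbf{0}$.

(c) $(\mathbf{x} + \mathbf{x}') \cdot \mathbf{x}'' = (\mathbf{x} \cdot \mathbf{x}'') + (\mathbf{x}' \cdot \mathbf{x}'')$.
(d) $(\alpha\mathbf{x}) \cdot \mathbf{x}' = \alpha(\mathbf{x} \cdot \mathbf{x}')$.

(e) $\mathbf{x} \cdot \mathbf{x}' = \mathbf{x}' \cdot \mathbf{x}$.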



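If $\mathbf{x} \cdot \mathbf{x}' = 0$, we say that the two vectors $\mathbf{x}$ and $\mathbf{x}'$ are “orthogonal.”

Let us finally define, but only in the three-dimensional space $\mathbb{R}^3$, the “cross
product” of two vectors. Given $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{x}' = (x_1', x_2', x_3')$, we set
$$\mathbf{x} \times \mathbf{x}' = (x_2 x_3' - x_3 x_2',\ x_3 x_1' - x_1 x_3',\ x_1 x_2' - x_2 x_1').$$
It can be verified that $\mathbf{x} \times \mathbf{x}'$ is orthogonal to both $\mathbf{x}$ and $\mathbf{x}'$:
$$(\mathbf{x} \times \mathbf{x}') \cdot \mathbf{x} = 0, \qquad (\mathbf{x} \times \mathbf{x}') \cdot \mathbf{x}' = 0.$$
Moreover, when $\mathbf{x} \times \mathbf{x}' \ne \mathbf{0}$, since
$$\det \begin{pmatrix} x_1 & x_2 & x_3 \\ x_1' & x_2' & x_3' \\ x_2 x_3' - x_3 x_2' & x_3 x_1' - x_1 x_3' & x_1 x_2' - x_2 x_1' \end{pmatrix} > 0,$$
the direction of $\mathbf{x} \times \mathbf{x}'$ is provided by the so-called “right-hand rule.” This means
that the triple $(\mathbf{x}, \mathbf{x}', \mathbf{x} \times \mathbf{x}')$ has the same orientation as $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$, the canonical
basis
$$\mathbf{e}_1 = (1, 0, 0), \qquad \mathbf{e}_2 = (0, 1, 0), \qquad \mathbf{e}_3 = (0, 0, 1).$$
What follows are some properties of the cross product:
(a) $(\mathbf{x} + \mathbf{x}') \times \mathbf{x}'' = (\mathbf{x} \times \mathbf{x}'') + (\mathbf{x}' \times \mathbf{x}'')$;
(b) $(\alpha\mathbf{x}) \times \mathbf{x}' = \alpha(\mathbf{x} \times \mathbf{x}')$;
(c) $\mathbf{x} \times \mathbf{x}' = -\mathbf{x}' \times \mathbf{x}$.
One can also prove the Jacobi identity
$$\mathbf{x} \times (\mathbf{x}' \times \mathbf{x}'') + \mathbf{x}' \times (\mathbf{x}'' \times \mathbf{x}) + \mathbf{x}'' \times (\mathbf{x} \times \mathbf{x}') = \mathbf{0}.$$

Finally, note that
$$\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3, \qquad \mathbf{e}_2 \times \mathbf{e}_3 = \mathbf{e}_1, \qquad \mathbf{e}_3 \times \mathbf{e}_1 = \mathbf{e}_2.$$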
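The componentwise definition of the cross product is easy to test on examples. The sketch below represents vectors of $\mathbb{R}^3$ as Python tuples, which is an assumption of this illustration, as are the function names `dot` and `cross`.

```python
def dot(x, y):
    """Scalar product x . y of two vectors of the same length."""
    return sum(xk * yk for xk, yk in zip(x, y))


def cross(x, y):
    """Cross product of two vectors of R^3, following the definition above."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)


e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
x, y = (1, 2, 3), (4, 5, 6)
z = cross(x, y)  # orthogonal to both x and y
```

Here `z` is $(-3, 6, -3)$, and both `dot(z, x)` and `dot(z, y)` vanish, in agreement with the orthogonality property.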


1.4.1. Euclidean Norm and Distance 


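Starting from the scalar product, we can define the “Euclidean norm” of a vector
$\mathbf{x} = (x_1, x_2, \dots, x_N)$ as
$$\|\mathbf{x}\| = \sqrt{\mathbf{x} \cdot \mathbf{x}} = \sqrt{\sum_{k=1}^{N} x_k^2}.$$
The following properties hold:


(a) $\|\mathbf{x}\| \ge 0$.
(b) $\|\mathbf{x}\| = 0 \iff \mathbf{x} = \mathbf{0}$.
(c) $\|\alpha\mathbf{x}\| = |\alpha|\,\|\mathbf{x}\|$.
(d) $\|\mathbf{x} + \mathbf{x}'\| \le \|\mathbf{x}\| + \|\mathbf{x}'\|$.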


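To prove the subadditivity property (d), we need the following Schwarz Inequality.


Theorem 1.13 For any two vectors $\mathbf{x}$, $\mathbf{x}'$, we have that
$$|\mathbf{x} \cdot \mathbf{x}'| \le \|\mathbf{x}\|\,\|\mathbf{x}'\|.$$
Equality holds if and only if $\mathbf{x}$ and $\mathbf{x}'$ are linearly dependent.


Proof The inequality surely holds if $\mathbf{x}' = \mathbf{0}$, since in that case $\mathbf{x} \cdot \mathbf{x}' = 0$ and
$\|\mathbf{x}'\| = 0$. Assume, then, $\mathbf{x}' \ne \mathbf{0}$. For any $\alpha \in \mathbb{R}$ we have
$$0 \le \|\mathbf{x} - \alpha\mathbf{x}'\|^2 = (\mathbf{x} - \alpha\mathbf{x}') \cdot (\mathbf{x} - \alpha\mathbf{x}') = \|\mathbf{x}\|^2 - 2\alpha\,\mathbf{x} \cdot \mathbf{x}' + \alpha^2 \|\mathbf{x}'\|^2.$$
Taking $\alpha = \frac{1}{\|\mathbf{x}'\|^2}\,\mathbf{x} \cdot \mathbf{x}'$, we obtain
$$0 \le \|\mathbf{x}\|^2 - 2\frac{1}{\|\mathbf{x}'\|^2}(\mathbf{x} \cdot \mathbf{x}')^2 + \frac{1}{\|\mathbf{x}'\|^4}(\mathbf{x} \cdot \mathbf{x}')^2 \|\mathbf{x}'\|^2 = \|\mathbf{x}\|^2 - \frac{1}{\|\mathbf{x}'\|^2}(\mathbf{x} \cdot \mathbf{x}')^2,$$
whence the inequality we wanted to prove.
Concerning the second part of the statement, we can assume that both vectors $\mathbf{x}$
and $\mathbf{x}'$ are different from $\mathbf{0}$. If $\mathbf{x}$ is equal to $\alpha\mathbf{x}'$, for some $\alpha \in \mathbb{R}$, then
$$|(\alpha\mathbf{x}') \cdot \mathbf{x}'| = |\alpha|\,|\mathbf{x}' \cdot \mathbf{x}'| = |\alpha|\,\|\mathbf{x}'\|^2 = \|\alpha\mathbf{x}'\|\,\|\mathbf{x}'\|.$$
Conversely, if $\mathbf{x} \cdot \mathbf{x}' = \|\mathbf{x}\|\,\|\mathbf{x}'\|$, then
$$\left\| \frac{\mathbf{x}}{\|\mathbf{x}\|} - \frac{\mathbf{x}'}{\|\mathbf{x}'\|} \right\|^2 = 2 - 2\,\frac{\mathbf{x} \cdot \mathbf{x}'}{\|\mathbf{x}\|\,\|\mathbf{x}'\|} = 0,$$
hence $\frac{\mathbf{x}}{\|\mathbf{x}\|} = \frac{\mathbf{x}'}{\|\mathbf{x}'\|}$. Because the two vectors have the same direction, they are
linearly dependent. Similarly, if $\mathbf{x} \cdot \mathbf{x}' = -\|\mathbf{x}\|\,\|\mathbf{x}'\|$, then we can see that
$\frac{\mathbf{x}}{\|\mathbf{x}\|} = -\frac{\mathbf{x}'}{\|\mathbf{x}'\|}$. □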




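Let us now prove property (d) of the norm. Using the Schwarz inequality,

$$\begin{aligned}
\|\mathbf{x} + \mathbf{x}'\|^2 &= (\mathbf{x} + \mathbf{x}') \cdot (\mathbf{x} + \mathbf{x}') \\
&= \|\mathbf{x}\|^2 + 2\,\mathbf{x} \cdot \mathbf{x}' + \|\mathbf{x}'\|^2 \\
&\le \|\mathbf{x}\|^2 + 2\,\|\mathbf{x}\|\,\|\mathbf{x}'\| + \|\mathbf{x}'\|^2 = (\|\mathbf{x}\| + \|\mathbf{x}'\|)^2,
\end{aligned}$$

whence the inequality we were looking for.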
At this point, it is worth recording the parallelogram identity

$$\|\mathbf{x} + \mathbf{x}'\|^2 + \|\mathbf{x} - \mathbf{x}'\|^2 = 2\big(\|\mathbf{x}\|^2 + \|\mathbf{x}'\|^2\big),$$

which in simple words states that, in any parallelogram, the sum of the squares of
its two diagonals is equal to the sum of the squares of its four sides.
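The identity follows by expanding both squared norms through the scalar product, in the same way as in the proof of property (d). A short derivation, using only the notation already introduced:

```latex
\begin{align*}
\|\mathbf{x} + \mathbf{x}'\|^2 &= \|\mathbf{x}\|^2 + 2\,\mathbf{x}\cdot\mathbf{x}' + \|\mathbf{x}'\|^2, \\
\|\mathbf{x} - \mathbf{x}'\|^2 &= \|\mathbf{x}\|^2 - 2\,\mathbf{x}\cdot\mathbf{x}' + \|\mathbf{x}'\|^2, \\
\|\mathbf{x} + \mathbf{x}'\|^2 + \|\mathbf{x} - \mathbf{x}'\|^2 &= 2\,\|\mathbf{x}\|^2 + 2\,\|\mathbf{x}'\|^2 .
\end{align*}
```

The mixed terms cancel when the first two lines are added, which gives the third.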


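Let us now define, using the norm, the “Euclidean distance” between two
elements $\mathbf{x} = (x_1, x_2, \dots, x_N)$ and $\mathbf{x}' = (x_1', x_2', \dots, x_N')$ of $\mathbb{R}^N$ as

$$d(\mathbf{x}, \mathbf{x}') = \|\mathbf{x} - \mathbf{x}'\| = \sqrt{\sum_{k=1}^{N} (x_k - x_k')^2}\,.$$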


The following properties hold: 


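(a) $d(\mathbf{x}, \mathbf{x}') \ge 0$.

(b) $d(\mathbf{x}, \mathbf{x}') = 0 \iff \mathbf{x} = \mathbf{x}'$.

(c) $d(\mathbf{x}, \mathbf{x}') = d(\mathbf{x}', \mathbf{x})$.

(d) $d(\mathbf{x}, \mathbf{x}'') \le d(\mathbf{x}, \mathbf{x}') + d(\mathbf{x}', \mathbf{x}'')$.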


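The last property is called the “triangle inequality.” Let us prove it as follows:

$$\begin{aligned}
d(\mathbf{x}, \mathbf{x}'') &= \|\mathbf{x} - \mathbf{x}''\| \\
&= \|(\mathbf{x} - \mathbf{x}') + (\mathbf{x}' - \mathbf{x}'')\| \\
&\le \|\mathbf{x} - \mathbf{x}'\| + \|\mathbf{x}' - \mathbf{x}''\| = d(\mathbf{x}, \mathbf{x}') + d(\mathbf{x}', \mathbf{x}'').
\end{aligned}$$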


In general, a real vector space $V$ is said to be a “normed vector space” if there is
a function $\|\cdot\| : V \to \mathbb{R}$, a “norm,” with the following properties. For any $x, x'$ in
$V$ and $\alpha \in \mathbb{R}$:

(a) $\|x\| \ge 0$.

(b) $\|x\| = 0 \iff x = 0$.

(c) $\|\alpha x\| = |\alpha|\,\|x\|$.

(d) $\|x + x'\| \le \|x\| + \|x'\|$.




As we have seen, $\mathbb{R}^N$ is a normed vector space, with its Euclidean norm. Note,
however, that different norms of a vector $\mathbf{x} = (x_1, x_2, \dots, x_N)$ could also be defined
on $\mathbb{R}^N$, for example,

$$\|\mathbf{x}\|_\star = \sum_{k=1}^{N} |x_k|, \quad\text{or}\quad \|\mathbf{x}\|_{\star\star} = \max\{|x_k| : k = 1, 2, \dots, N\}.$$
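To see how these norms differ on a concrete vector, here is a minimal sketch. The function names `norm_euclid`, `norm_star`, and `norm_star_star` are our own labels for the three norms just mentioned, and vectors are represented as tuples.

```python
import math

def norm_euclid(x):
    """Euclidean norm: square root of the sum of the squared components."""
    return math.sqrt(sum(a * a for a in x))

def norm_star(x):
    """Sum of the absolute values of the components."""
    return sum(abs(a) for a in x)

def norm_star_star(x):
    """Largest absolute value among the components."""
    return max(abs(a) for a in x)

x = (3.0, -4.0)
print(norm_euclid(x), norm_star(x), norm_star_star(x))  # 5.0 7.0 4.0

# Subadditivity, property (d), on one sample pair of vectors.
y = (1.0, 2.0)
s = tuple(a + b for a, b in zip(x, y))
for n in (norm_euclid, norm_star, norm_star_star):
    assert n(s) <= n(x) + n(y)
```

The three values differ, yet each function satisfies properties (a) to (d) of a norm.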


1.5 Metric Spaces 


For any nonempty set $E$, a function $d : E \times E \to \mathbb{R}$ is said to be a “distance” (on
$E$) if it satisfies the following properties:

(a) $d(x, x') \ge 0$.

(b) $d(x, x') = 0 \iff x = x'$.

(c) $d(x, x') = d(x', x)$.

(d) $d(x, x'') \le d(x, x') + d(x', x'')$ (the triangle inequality).


The set E, provided with the distance d, is said to be a “metric space.” Its 
elements will often be referred to as “points.” 

We have seen that $\mathbb{R}^N$, provided with the Euclidean distance, is a metric space (in
what follows, when speaking of $\mathbb{R}^N$ as a metric space, if not explicitly mentioned,
we will always assume the given distance to be the Euclidean one). In the case
$N = 1$, we have the usual distance on $\mathbb{R}$, i.e., $d(\alpha, \beta) = |\alpha - \beta|$.

More generally, any normed vector space is a metric space with the distance
$d(x, x') = \|x - x'\|$. The fact that this is indeed a distance can be verified using the
same approach as for the Euclidean distance.

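It is also possible to consider different distances on the same set. For instance,
taking two elements $\mathbf{x} = (x_1, x_2, \dots, x_N)$ and $\mathbf{x}' = (x_1', x_2', \dots, x_N')$ of $\mathbb{R}^N$, the
function

$$d_\star(\mathbf{x}, \mathbf{x}') = \|\mathbf{x} - \mathbf{x}'\|_\star = \sum_{k=1}^{N} |x_k - x_k'|$$

is also a distance on $\mathbb{R}^N$. The same is true of the function

$$d_{\star\star}(\mathbf{x}, \mathbf{x}') = \|\mathbf{x} - \mathbf{x}'\|_{\star\star} = \max\{|x_k - x_k'| : k = 1, 2, \dots, N\},$$

and even

$$\hat{d}(\mathbf{x}, \mathbf{x}') = \begin{cases} 0 & \text{if } \mathbf{x} = \mathbf{x}', \\ 1 & \text{if } \mathbf{x} \ne \mathbf{x}'; \end{cases}$$

this is also a distance, however strange it might seem.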
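The four properties of a distance can be tested mechanically on a finite sample of points. The sketch below does this for $d_\star$, $d_{\star\star}$, and $\hat d$; the names `d_star`, `d_star_star`, `d_hat`, `satisfies_axioms`, and `squared` are our own, and the sample points are arbitrary. Passing such a test on a sample does not prove that a function is a distance, but failing it does show that it is not one.

```python
import itertools

def d_star(x, y):
    """Sum of the absolute differences of the components."""
    return sum(abs(a - b) for a, b in zip(x, y))

def d_star_star(x, y):
    """Largest absolute difference among the components."""
    return max(abs(a - b) for a, b in zip(x, y))

def d_hat(x, y):
    """Distance equal to 0 for equal points and to 1 otherwise."""
    return 0 if x == y else 1

def satisfies_axioms(d, points):
    """Check properties (a)-(d) of a distance on all triples from a finite sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) < 0:                      # (a) nonnegativity
            return False
        if (d(x, y) == 0) != (x == y):       # (b) zero exactly for equal points
            return False
        if d(x, y) != d(y, x):               # (c) symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z):      # (d) triangle inequality
            return False
    return True

sample = [(0, 0), (1, 2), (-3, 1), (2, 2)]
for d in (d_star, d_star_star, d_hat):
    assert satisfies_axioms(d, sample)

def squared(x, y):
    """Squared Euclidean distance: NOT a distance, the triangle inequality fails."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

assert not satisfies_axioms(squared, [(0,), (1,), (2,)])
```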




Now, let $E$ be any metric space. Given a point $x_0 \in E$ and a real number $\rho > 0$,
we define the “open ball” centered at $x_0$ with radius $\rho$ as

$$B(x_0, \rho) = \{x \in E : d(x, x_0) < \rho\}.$$

Similarly, we define the “closed ball”

$$\overline{B}(x_0, \rho) = \{x \in E : d(x, x_0) \le \rho\}$$

and the “sphere”

$$S(x_0, \rho) = \{x \in E : d(x, x_0) = \rho\}.$$


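In $\mathbb{R}$, every interval $]a, b[$ is an open ball, and every interval $[a, b]$ is a closed
ball; indeed,

$$]a, b[\; = B\!\left(\frac{a+b}{2}, \frac{b-a}{2}\right), \qquad [a, b] = \overline{B}\!\left(\frac{a+b}{2}, \frac{b-a}{2}\right).$$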


On the other hand, a sphere in $\mathbb{R}$ is a set having just two points.

In $\mathbb{R}^2$ (with the Euclidean distance), a ball is a disk; an open ball does not contain
the external circle, but a closed ball does. A sphere is just the circle.

If in $\mathbb{R}^2$ we consider the distance $d_\star$ defined earlier, an open ball will be a square
whose sides have an inclination of 45°, having $\mathbf{x}_0$ as its central point. A sphere will
be the perimeter of such a square.

On the other hand, if we consider the distance $d_{\star\star}$, a ball will still be a square, but
with sides parallel to the cartesian axes. We will often denote a closed ball related
to this distance by $B[\mathbf{x}_0, \rho]$; if $\mathbf{x}_0 = (x_1^0, x_2^0, \dots, x_N^0)$, then

$$B[\mathbf{x}_0, \rho] = [x_1^0 - \rho, x_1^0 + \rho] \times \cdots \times [x_N^0 - \rho, x_N^0 + \rho].$$


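A somewhat strange situation arises if we consider the distance $\hat{d}$ (on any set $E$).
We have

$$B(x_0, \rho) = \begin{cases} \{x_0\} & \text{if } \rho \le 1, \\ E & \text{if } \rho > 1, \end{cases} \qquad \overline{B}(x_0, \rho) = \begin{cases} \{x_0\} & \text{if } \rho < 1, \\ E & \text{if } \rho \ge 1; \end{cases}$$

hence

$$S(x_0, \rho) = \begin{cases} E \setminus \{x_0\} & \text{if } \rho = 1, \\ \emptyset & \text{if } \rho \ne 1. \end{cases}$$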
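On a finite set these balls can be listed explicitly. The following sketch uses a three-element set `E` of our own choosing; `open_ball`, `closed_ball`, and `sphere` are hypothetical helper names that transcribe the three definitions directly.

```python
def d_hat(x, y):
    """Distance equal to 0 for equal points and to 1 otherwise."""
    return 0 if x == y else 1

def open_ball(E, d, x0, rho):
    """Points of E at distance strictly less than rho from x0."""
    return {x for x in E if d(x, x0) < rho}

def closed_ball(E, d, x0, rho):
    """Points of E at distance at most rho from x0."""
    return {x for x in E if d(x, x0) <= rho}

def sphere(E, d, x0, rho):
    """Points of E at distance exactly rho from x0."""
    return {x for x in E if d(x, x0) == rho}

E = {"a", "b", "c"}

# Radius 1: the open ball is a single point, the closed ball is all of E.
assert open_ball(E, d_hat, "a", 1) == {"a"}
assert closed_ball(E, d_hat, "a", 1) == E
assert sphere(E, d_hat, "a", 1) == {"b", "c"}

# Any radius different from 1 gives an empty sphere.
assert sphere(E, d_hat, "a", 0.5) == set()
assert sphere(E, d_hat, "a", 2) == set()
```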


We will now introduce a series of definitions that will be crucial for understanding
the theory we want to develop.




Definition 1.14 A set $U \subseteq E$ is said to be a “neighborhood” of a point $x_0$ if there
exists a $\rho > 0$ such that $B(x_0, \rho) \subseteq U$; in that case, the point $x_0$ is said to be an
“internal point” of $U$. The set of all internal points of $U$ is called the “interior” of $U$
and is denoted by $\mathring{U}$. Clearly, we always have the inclusion $\mathring{U} \subseteq U$. It is said that $U$
is an “open” set if it coincides with its interior, i.e., if $U = \mathring{U}$.


Here is an example of an open set. 
Theorem 1.15 An open ball is an open set.


Proof Let $U = B(x_0, \rho)$ be an open ball, and take any point $x_1 \in U$. We want
to prove that $x_1 \in \mathring{U}$, i.e., $x_1$ is an interior point of $U$. Choose $r > 0$ such that
$r < \rho - d(x_0, x_1)$. If we show that $B(x_1, r) \subseteq U$, our proof will be completed.
For any $x \in B(x_1, r)$ we have

$$d(x, x_0) \le d(x, x_1) + d(x_1, x_0) < r + d(x_1, x_0) < \rho,$$

so that $x \in B(x_0, \rho)$. We have thus shown that $B(x_1, r) \subseteq B(x_0, \rho)$. □


Examples Let us analyze three particular examples. In the first one, the set $U$
coincides with $E$; in the second one, $U$ is the empty set; in the third one, it is made
of a single point.


1. Every point of $E$ is internal to $E$ since every ball is by definition contained in $E$,
the universal set. Hence, the interior of $E$ coincides with $E$, i.e., $\mathring{E} = E$. This
means that $E$ is an open set.

2. The empty set $\emptyset$ cannot have internal points. Hence, its interior, having no
elements, is the empty set, i.e., $\mathring{\emptyset} = \emptyset$, meaning $\emptyset$ is an open set.

3. In general, the set $U = \{x_0\}$, made up of a single point, is not an open set (e.g., in
$\mathbb{R}$ with the Euclidean distance), but it could be an open set in certain situations,
i.e., when $x_0$ is an “isolated” point of $E$. This could happen, for instance, if $E = \mathbb{N}$,
with the usual distance inherited from $\mathbb{R}$, or when considering the distance $\hat{d}$.


Theorem 1.16 The interior of any set U is an open set. 


Proof If $\mathring{U} = \emptyset$, this is surely true. Assume, then, that $\mathring{U}$ is nonempty, and take any
point $x_1 \in \mathring{U}$. Then there exists a $\rho > 0$ such that $B(x_1, \rho) \subseteq U$. If we show that
$B(x_1, \rho) \subseteq \mathring{U}$, our proof will be completed, since we will have proved that every
point $x_1$ of $\mathring{U}$ is an internal point of $\mathring{U}$.

To prove that $B(x_1, \rho) \subseteq \mathring{U}$, let $x$ be an element of $B(x_1, \rho)$. Since $B(x_1, \rho)$ is
an open set, there exists an $r > 0$ such that $B(x, r) \subseteq B(x_1, \rho)$. Then $B(x, r) \subseteq U$,
showing that $x$ belongs to $\mathring{U}$. The proof is complete. □




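The following implication holds:

$$U_1 \subseteq U_2 \;\Longrightarrow\; \mathring{U}_1 \subseteq \mathring{U}_2.$$

As a consequence, we see that $\mathring{U}$ is the greatest open set contained in $U$; indeed, if
$A$ is an open set and $A \subseteq U$, then $A \subseteq \mathring{U}$.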


Definition 1.17 A point $x_0$ is said to be an “adherent point” of a set $U$ if for every
$\rho > 0$ we have that $B(x_0, \rho) \cap U \ne \emptyset$. The set of all adherent points of $U$ is said
to be the “closure” of $U$ and is denoted by $\overline{U}$. Clearly, we always have the inclusion
$U \subseteq \overline{U}$. It is said that $U$ is a “closed” set if it coincides with its closure, i.e., if
$U = \overline{U}$.


Here is an example of a closed set. 
Theorem 1.18 A closed ball is a closed set. 


Proof Let $U = \overline{B}(x_0, \rho)$ be a closed ball. To prove that $\overline{U} \subseteq U$, we will
equivalently show that $\mathcal{C}U \subseteq \mathcal{C}\overline{U}$. This is surely true if $U = E$, i.e., if $\mathcal{C}U = \emptyset$.
Thus, assume now that $\mathcal{C}U$ is nonempty. Take any point $x_1 \in \mathcal{C}U$, i.e., such that
$d(x_1, x_0) > \rho$. We want to prove that $x_1 \in \mathcal{C}\overline{U}$, i.e., that $x_1$ is not an adherent point
of $U$. Choose $r > 0$ such that $r < d(x_0, x_1) - \rho$. If we show that $B(x_1, r) \cap U = \emptyset$,
our proof will be completed.

Assume by contradiction that $B(x_1, r) \cap \overline{B}(x_0, \rho) \ne \emptyset$, and take an $x \in
B(x_1, r) \cap \overline{B}(x_0, \rho)$. Then

$$d(x_0, x_1) \le d(x_0, x) + d(x, x_1) < \rho + r < \rho + (d(x_0, x_1) - \rho) = d(x_0, x_1),$$

which is clearly impossible. □


Examples Let us consider again the aforementioned three examples: $U = E$, $U = \emptyset$,
and $U = \{x_0\}$.


1. Since $E$ is the universal set, every adherent point of $E$ necessarily belongs to $E$.
Hence, the closure of $E$ coincides with $E$, i.e., $\overline{E} = E$. This means that $E$ is a
closed set.

2. The empty set $\emptyset$ has no adherent points. Indeed, taking any point $x_0$ in $E$, for
every $\rho > 0$ we have that $B(x_0, \rho) \cap \emptyset = \emptyset$. Hence, the closure of $\emptyset$, having no
elements at all, is empty, i.e., $\overline{\emptyset} = \emptyset$, meaning $\emptyset$ is a closed set.

3. The set $U = \{x_0\}$, made up of a single point, is always a closed set. Indeed, if
we take any $x_1 \notin U$, choosing $\rho > 0$ such that $\rho < d(x_0, x_1)$, we have that
$B(x_1, \rho) \cap U = \emptyset$, thereby demonstrating that $x_1$ is not an adherent point of $U$.




Theorem 1.19 The closure of any set U is a closed set. 


Proof Let $V = \overline{U}$. If $V = E$, this surely is a closed set. Let us then assume
that $V \ne E$. We need to show that any adherent point of $V$ belongs to $V$. By
contradiction, assume that there exists some $x_1$ in $\overline{V}$ that does not belong to $V$.
Since $x_1 \notin \overline{U}$, there is a $\rho > 0$ such that $B(x_1, \rho) \cap U = \emptyset$. On the other hand,
since $x_1 \in \overline{V}$, we have that $B(x_1, \rho) \cap V \ne \emptyset$. Take an $x \in B(x_1, \rho) \cap V$. Then,
since $B(x_1, \rho)$ is an open set, there exists $r > 0$ such that $B(x, r) \subseteq B(x_1, \rho)$.
Since $x \in V = \overline{U}$, we have $B(x, r) \cap U \ne \emptyset$ and, hence, also $B(x_1, \rho) \cap U \ne \emptyset$, a
contradiction. □


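The following implication holds:

$$U_1 \subseteq U_2 \;\Longrightarrow\; \overline{U}_1 \subseteq \overline{U}_2.$$

As a consequence, we see that $\overline{U}$ is the smallest closed set containing $U$: If $C$ is a
closed set and $C \supseteq U$, then $C \supseteq \overline{U}$.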
We will now try to understand the relationships between the notions of interior 
and closure of a set and those between open and closed sets. 
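Theorem 1.20 The following identities hold:

$$\overline{\mathcal{C}U} = \mathcal{C}\mathring{U}, \qquad (\mathcal{C}U)^\circ = \mathcal{C}\overline{U}.$$

Proof Let us prove the first one. First of all notice that

$$\overline{\mathcal{C}U} = \emptyset \iff \mathcal{C}U = \emptyset \iff U = E \iff \mathring{U} = E \iff \mathcal{C}\mathring{U} = \emptyset.$$

Assume now that $\mathcal{C}U \ne \emptyset$. Then

$$\begin{aligned}
x \in \overline{\mathcal{C}U} &\iff \forall \rho > 0 \quad B(x, \rho) \cap \mathcal{C}U \ne \emptyset \\
&\iff \forall \rho > 0 \quad B(x, \rho) \not\subseteq U \\
&\iff x \notin \mathring{U} \\
&\iff x \in \mathcal{C}\mathring{U}.
\end{aligned}$$

This proves the first identity. Now let $V = \mathcal{C}U$. Then

$$\mathring{V} = \mathcal{C}(\mathcal{C}\mathring{V}) = \mathcal{C}(\overline{\mathcal{C}V}) = \mathcal{C}\overline{U},$$

thereby also proving the second identity. □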




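As a consequence of the preceding theorem,

$$\overline{U} = \mathcal{C}\big((\mathcal{C}U)^\circ\big), \qquad \mathring{U} = \mathcal{C}\big(\overline{\mathcal{C}U}\big).$$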
Moreover, we have the following corollary. 


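Corollary 1.21 A set is open (closed) if and only if its complement is closed
(open).


Proof If $U$ is open, then $U = \mathring{U}$, hence $\overline{\mathcal{C}U} = \mathcal{C}\mathring{U} = \mathcal{C}U$, so that $\mathcal{C}U$ is closed. On
the other hand, if $U$ is closed, then $U = \overline{U}$, hence $(\mathcal{C}U)^\circ = \mathcal{C}\overline{U} = \mathcal{C}U$, so that $\mathcal{C}U$ is
open. □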


It is possible to prove that the union and the intersection of two open (closed) sets
is an open (closed) set. The same holds true for any finite number of them.

However, if one considers an infinite number of open sets, it can be proved that
their union is still an open set, whereas their intersection might not be. For example,
in $\mathbb{R}$, taking the open sets

$$A_n = \left]-\frac{1}{n+1}, \frac{1}{n+1}\right[,$$

with $n \in \mathbb{N}$, their intersection is $\{0\}$, which is not an open set.

Analogously, if one considers an infinite number of closed sets, it can be proved
that their intersection is still a closed set, whereas their union might not be. For
example, in $\mathbb{R}$, taking the closed sets

$$C_n = \left[-1 + \frac{1}{n+1},\; 1 - \frac{1}{n+1}\right],$$

with $n \in \mathbb{N}$, their union is $]-1, 1[$, which is not a closed set.
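The claim about the intersection of the sets $A_n$ can be made concrete with exact rational arithmetic. In the sketch below, `in_A` and `first_excluding_index` are hypothetical helper names of ours: the point 0 belongs to every $A_n$ that we test, whereas any nonzero rational $x$ is excluded from $A_n$ as soon as $\frac{1}{n+1} \le |x|$.

```python
from fractions import Fraction

def in_A(n, x):
    """Is x in A_n = ]-1/(n+1), 1/(n+1)[ ?"""
    return -Fraction(1, n + 1) < x < Fraction(1, n + 1)

def first_excluding_index(x):
    """Smallest n with x not in A_n; terminates for every rational x != 0."""
    n = 0
    while in_A(n, x):
        n += 1
    return n

# 0 lies in every A_n (checked here only for the first 1000 indices).
assert all(in_A(n, 0) for n in range(1000))

# A nonzero point is eventually left out of the sets A_n.
print(first_excluding_index(Fraction(1, 10)))   # 9
print(first_excluding_index(Fraction(-3, 7)))   # 2
```

The same experiment with the sets $C_n$ would show that every point of $]-1, 1[$ eventually enters some $C_n$, while the points $-1$ and $1$ never do.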


Definition 1.22 The “boundary” of a set $U$, denoted by $\partial U$, is defined as the
difference between its closure and its interior, i.e.,

$$\partial U = \overline{U} \setminus \mathring{U}.$$


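We should be careful not to put too much trust in our intuition, naturally
developed in a Euclidean world. For example, it is true in $\mathbb{R}^N$ that

$$\overline{B(\mathbf{x}_0, \rho)} = \overline{B}(\mathbf{x}_0, \rho), \qquad \partial B(\mathbf{x}_0, \rho) = S(\mathbf{x}_0, \rho).$$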




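However, these identities are not valid in every metric space $E$. For instance, if we
take the previously defined distance $\hat{d}$, then $B(x_0, 1) = \{x_0\}$, which is a closed set,
and $\overline{B}(x_0, 1) = E$, so that $\overline{B(x_0, 1)} \ne \overline{B}(x_0, 1)$ whenever $E$ has more than one point. Moreover, $\partial B(x_0, 1) = \emptyset$, whereas
$S(x_0, 1) = E \setminus \{x_0\}$, so that $\partial B(x_0, 1) \ne S(x_0, 1)$.

As a curious example, in $\mathbb{R}$, taking $U = \mathbb{Q}$, we have $\mathring{\mathbb{Q}} = \emptyset$ and $\overline{\mathbb{Q}} = \mathbb{R}$, hence $\partial\mathbb{Q} = \mathbb{R}$.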


2 Continuity


In this chapter we introduce one of the most important concepts in mathematical 
analysis: the “continuity” of a function. This topic will be treated in the general 
framework of metric spaces. 


2.1 Continuous Functions 


Intuitively, a function $f$ is “continuous” if the value $f(x)$ varies gradually when $x$
varies in the domain, in other words, if we encounter no sudden variations in the
values of the function. In order to make this intuitive idea rigorous enough, it will
be convenient to focus our attention on a point $x_0$ of the domain and to clarify what
we mean by


$f$ is “continuous” at $x_0$.
We will proceed gradually. 


First Attempt We will say that $f$ is “continuous” at $x_0$ when the following
statement holds:


If $x$ is near $x_0$, then $f(x)$ is near $f(x_0)$.


We immediately observe that, although the idea of continuity is already quite
well formulated, the preceding proposition is not an acceptable definition, because
the word “near,” which appears twice, does not have a precise meaning.
First of all, to measure how close $x$ is to $x_0$ and how close $f(x)$ is to $f(x_0)$, we need
to introduce distances. More precisely, we will have to assume that the domain and
the codomain of the function are metric spaces.


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_2




Let, then, $E$ and $F$ be two metric spaces, with their distances $d_E$ and $d_F$,
respectively. Let $x_0$ be a point in $E$ and $f : E \to F$ be our function. Let us make a
second attempt at a definition.


Second Attempt We will say that $f$ is “continuous” at $x_0$ when the following
statement holds:


If $d_E(x, x_0)$ is small, then $d_F(f(x), f(x_0))$ is small.


We immediately realize that the problem encountered in the first attempt at a
definition has not been solved at all with this second attempt, since now the word
“small,” which appears twice, has no precise meaning.

We then ask ourselves: How small do we want the distance $d_F(f(x), f(x_0))$
to be? What we have in mind is that this distance can be made as small as we
want (provided that the distance $d_E(x, x_0)$ is small enough, of course). To be able
to measure it, we will then introduce a positive real number, which we call $\varepsilon$, and
we will require $d_F(f(x), f(x_0))$ to be smaller than $\varepsilon$ when $d_E(x, x_0)$ is sufficiently
small. The arbitrariness of this positive number $\varepsilon$ will allow us to take it as small as
we like.


Third Attempt We will say that $f$ is “continuous” at $x_0$ when the following
statement holds true: Taking any number $\varepsilon > 0$,


if $d_E(x, x_0)$ is small, then $d_F(f(x), f(x_0)) < \varepsilon$.


Now the word “small” appears only once, whereas the distance $d_F(f(x), f(x_0))$
is simply controlled by the number $\varepsilon$. Hence, at least the second part of the
proposition now has a precise meaning. We could then try to do the same for the
distance $d_E(x, x_0)$, introducing a new positive number, which we call $\delta$, so we can
control it.


Fourth Attempt (the Good One!) We will say that $f$ is “continuous” at $x_0$ when
the following statement holds true: Taking any number $\varepsilon > 0$, it is possible to find
a number $\delta > 0$ for which,


if $d_E(x, x_0) < \delta$, then $d_F(f(x), f(x_0)) < \varepsilon$.

This last proposition, unlike the previous ones, contains no inaccurate words.
The distances $d_E(x, x_0)$ and $d_F(f(x), f(x_0))$ are now simply controlled by the two
positive numbers $\delta$ and $\varepsilon$, respectively. Let us rewrite it in a formal way.


Definition 2.1 We will say that $f$ is “continuous” at $x_0$ if, for any positive number
$\varepsilon$, there exists a positive number $\delta$ such that, if $x$ is any element in the domain $E$
whose distance from $x_0$ is less than $\delta$, then the distance of $f(x)$ from $f(x_0)$ is less
than $\varepsilon$. In symbols:

$$\forall \varepsilon > 0 \quad \exists \delta > 0 : \quad \forall x \in E \quad d_E(x, x_0) < \delta \;\Rightarrow\; d_F(f(x), f(x_0)) < \varepsilon.$$

Rather often, in this formulation, “$\forall x \in E$” will be tacitly understood.




Let us note that one or both of the inequalities

$$d_E(x, x_0) < \delta, \qquad d_F(f(x), f(x_0)) < \varepsilon$$

may be replaced, respectively, by

$$d_E(x, x_0) \le \delta, \qquad d_F(f(x), f(x_0)) \le \varepsilon$$

without changing the definition at all. This is due to the fact that, on the one hand,
$\varepsilon$ is any positive number and, on the other hand, if the implication holds for
some positive number $\delta$, then it holds a fortiori when that $\delta$ is replaced by any smaller
positive number.

Reading the definition of continuity again, we see that $f$ is continuous at $x_0$ if and
only if

$$\forall \varepsilon > 0 \quad \exists \delta > 0 : \quad f(B(x_0, \delta)) \subseteq B(f(x_0), \varepsilon).$$

Moreover, the statement is equivalent if either ball, or both, is taken to be closed instead
of open. Equivalently, we can say that $f$ is continuous at $x_0$ if and only if


for every neighborhood $V$ of $f(x_0)$
there exists a neighborhood $U$ of $x_0$
such that $f(U) \subseteq V$.


In what follows, we will often denote the distances in $E$ and $F$ simply by $d$. We
are confident that this will not create confusion.

When the function $f$ happens to be continuous at all points $x_0$ of its domain $E$,
we will say that “$f$ is continuous on $E$” or simply “$f$ is continuous.”

Let us provide a few examples. 


Example 1 The constant function: For some $c \in F$ we have that $f(x) = c$, for
every $x \in E$. Since $d_F(f(x), f(x_0)) = d_F(c, c) = 0$ for every $x \in E$, such a
function is clearly continuous (every choice of $\delta > 0$ is fine).


Example 2 Let $x_0$ be an “isolated point” of $E$, meaning there exists a $\rho > 0$ such
that there are no points of $E$ whose distance from $x_0$ is less than $\rho$, except $x_0$
itself. We can then see that, in this case, any function $f : E \to F$ is continuous
at $x_0$. Indeed, for any $\varepsilon > 0$, taking $\delta = \rho$, we will have $B(x_0, \delta) = \{x_0\}$, so

$$f(B(x_0, \delta)) = \{f(x_0)\} \subseteq B(f(x_0), \varepsilon).$$


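Example 3 Let $E = \mathbb{R}^N$ and $F = \mathbb{R}^N$. For a fixed number $\alpha \in \mathbb{R}$, let us consider
the function $f : \mathbb{R}^N \to \mathbb{R}^N$ defined as $f(\mathbf{x}) = \alpha\mathbf{x}$. This is a continuous function.
Indeed, if $\alpha = 0$, then we have a constant function with value $\mathbf{0}$, and we know
already that such a function is continuous. Assume, in contrast, that $\alpha \ne 0$. Then,
once $\varepsilon > 0$ has been fixed, since

$$\|f(\mathbf{x}) - f(\mathbf{x}_0)\| = \|\alpha\mathbf{x} - \alpha\mathbf{x}_0\| = \|\alpha(\mathbf{x} - \mathbf{x}_0)\| = |\alpha|\,\|\mathbf{x} - \mathbf{x}_0\|,$$

it is sufficient to take $\delta = \frac{\varepsilon}{|\alpha|}$ to verify that

$$\|\mathbf{x} - \mathbf{x}_0\| < \delta \;\Rightarrow\; \|f(\mathbf{x}) - f(\mathbf{x}_0)\| < \varepsilon.$$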
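The choice $\delta = \varepsilon/|\alpha|$ can be tried out numerically. The sketch below is an illustration on sample points only, not a proof; the names `dist`, `f`, and `delta_for`, and the particular values of `alpha`, `eps`, and `x0`, are our own assumptions.

```python
import math

def dist(x, y):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def f(alpha, x):
    """The map x -> alpha * x, applied componentwise."""
    return tuple(alpha * a for a in x)

def delta_for(alpha, eps):
    """The choice delta = eps / |alpha|, valid for alpha != 0."""
    return eps / abs(alpha)

alpha, eps = -3.0, 0.01
x0 = (1.0, 2.0)
delta = delta_for(alpha, eps)

# Sample points at distance delta / 2 from x0, in eight directions.
for k in range(8):
    t = 2 * math.pi * k / 8
    x = (x0[0] + 0.5 * delta * math.cos(t), x0[1] + 0.5 * delta * math.sin(t))
    assert dist(x, x0) < delta
    assert dist(f(alpha, x), f(alpha, x0)) < eps
```

A smaller `eps` simply yields a proportionally smaller `delta`, reflecting the identity $\|f(\mathbf{x}) - f(\mathbf{x}_0)\| = |\alpha|\,\|\mathbf{x} - \mathbf{x}_0\|$.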
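Example 4 Let $E = \mathbb{R}^N$ and $F = \mathbb{R}$. Let us show that the function $f : \mathbb{R}^N \to \mathbb{R}$
defined as $f(\mathbf{x}) = \|\mathbf{x}\|$ is continuous on $\mathbb{R}^N$. This fact will be a simple consequence
of the inequality

$$\big|\,\|\mathbf{x}\| - \|\mathbf{x}'\|\,\big| \le \|\mathbf{x} - \mathbf{x}'\|,$$

which we will now prove. We have

$$\begin{aligned}
\|\mathbf{x}\| &= \|(\mathbf{x} - \mathbf{x}') + \mathbf{x}'\| \le \|\mathbf{x} - \mathbf{x}'\| + \|\mathbf{x}'\|, \\
\|\mathbf{x}'\| &= \|(\mathbf{x}' - \mathbf{x}) + \mathbf{x}\| \le \|\mathbf{x}' - \mathbf{x}\| + \|\mathbf{x}\|.
\end{aligned}$$

Since $\|\mathbf{x} - \mathbf{x}'\| = \|\mathbf{x}' - \mathbf{x}\|$, we have

$$\|\mathbf{x}\| - \|\mathbf{x}'\| \le \|\mathbf{x} - \mathbf{x}'\| \quad\text{and}\quad \|\mathbf{x}'\| - \|\mathbf{x}\| \le \|\mathbf{x} - \mathbf{x}'\|,$$

whence the inequality we wanted to prove. Now, considering any point $\mathbf{x}_0 \in \mathbb{R}^N$,
once $\varepsilon > 0$ has been fixed, it is sufficient to take $\delta = \varepsilon$ to verify that

$$\|\mathbf{x} - \mathbf{x}_0\| < \delta \;\Rightarrow\; \big|\,\|\mathbf{x}\| - \|\mathbf{x}_0\|\,\big| < \varepsilon.$$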
Example 5 Let $E$ be any metric space, and $y_0 \in E$ be fixed. The function $f : E \to
\mathbb{R}$ defined as $f(x) = d(x, y_0)$ is continuous. The proof of this fact is similar to the
earlier one, since we can show that, for any $x_0 \in E$,

$$|d(x, y_0) - d(x_0, y_0)| \le d(x, x_0).$$


Example 6 Let $E = \mathbb{R}$ and $F = \mathbb{R}$. Consider the “sign function” $f : \mathbb{R} \to \mathbb{R}$
defined as

$$f(x) = \begin{cases} -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } x > 0. \end{cases}$$

We can show that this function is continuous at all points except at $x_0 = 0$. Indeed,
if $x_0 \ne 0$, then it will be sufficient to take $\delta < |x_0|$, so as to have $f$ constant
on the interval $]x_0 - \delta, x_0 + \delta[$ and, hence, continuous at $x_0$. To see that $f$ is not
continuous at 0, let us fix an $\varepsilon \in\, ]0, 1[$; for any choice of $\delta > 0$, it is possible to find
an $x \in\, ]-\delta, \delta[$ such that $|f(x)| = 1$, hence $|f(x) - f(0)| > \varepsilon$.
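The failure of continuity at 0 can be illustrated by exhibiting, for each $\delta$, an explicit point that violates the implication. In this sketch, `sign` transcribes the definition above, while `witness` and the choice $\varepsilon = 0.5$ are our own.

```python
def sign(x):
    """The sign function: -1 for x < 0, 0 for x = 0, 1 for x > 0."""
    if x < 0:
        return -1
    if x > 0:
        return 1
    return 0

def witness(delta):
    """A point of ]-delta, delta[ at which |sign(x) - sign(0)| = 1."""
    return delta / 2

eps = 0.5  # any value in ]0, 1[ would do
for delta in (1.0, 1e-3, 1e-9):
    x = witness(delta)
    assert -delta < x < delta               # x is within delta of 0 ...
    assert abs(sign(x) - sign(0)) > eps     # ... yet f(x) is far from f(0)
```

However small $\delta$ is taken, the witness $\delta/2$ defeats it, so no $\delta$ works for this $\varepsilon$.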


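Example 7 The “Dirichlet function” $D : \mathbb{R} \to \mathbb{R}$ is defined as

$$D(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}$$

It can be seen that, for any $x_0 \in \mathbb{R}$, this function is not continuous at $x_0$. Indeed,
fixing $\varepsilon \in\, ]0, 1[$, since both $\mathbb{Q}$ and $\mathbb{R} \setminus \mathbb{Q}$ are dense in $\mathbb{R}$, for every $x_0$ and any choice
of $\delta > 0$ there will surely be a rational number $x'$ and an irrational number $x''$ in
$]x_0 - \delta, x_0 + \delta[$; hence, based on $x_0$ being rational or irrational, we will have that
either $|D(x'') - D(x_0)| > \varepsilon$ or $|D(x') - D(x_0)| > \varepsilon$.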


Let us study the behavior of continuity with respect to the sum of two functions 
and to the product with a constant. In the following theorem, we assume that F is a 
normed vector space. 


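Theorem 2.2 Let $F$ be a normed vector space and $\alpha$ a real constant. If $f, g : E \to
F$ are continuous at $x_0$, then the same is true of $f + g$ and $\alpha f$.


Proof Let $\varepsilon > 0$ be fixed. By the continuity of $f$ and $g$ there exist $\delta_1 > 0$ and
$\delta_2 > 0$ such that

$$\begin{aligned}
d(x, x_0) < \delta_1 &\;\Rightarrow\; \|f(x) - f(x_0)\| < \varepsilon, \\
d(x, x_0) < \delta_2 &\;\Rightarrow\; \|g(x) - g(x_0)\| < \varepsilon.
\end{aligned}$$

Hence, taking $\delta = \min\{\delta_1, \delta_2\}$, we have that, if $d(x, x_0) < \delta$, then

$$\|(f + g)(x) - (f + g)(x_0)\| \le \|f(x) - f(x_0)\| + \|g(x) - g(x_0)\| < 2\varepsilon$$

and

$$\|(\alpha f)(x) - (\alpha f)(x_0)\| = |\alpha|\,\|f(x) - f(x_0)\| \le |\alpha|\,\varepsilon.$$

By the arbitrariness of $\varepsilon$, the statement is proved. □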
Remark 2.3 The conclusion of the preceding proof is correct since the $\varepsilon > 0$ in the
definition of continuity is arbitrary. Indeed, even if, for some constant $c > 0$, one
proves that

$$\forall \varepsilon > 0 \quad \exists \delta > 0 : \quad \forall x \in E \quad d_E(x, x_0) < \delta \;\Rightarrow\; d_F(f(x), f(x_0)) \le c\,\varepsilon,$$

this is sufficient to conclude that $f$ is continuous at $x_0$. This observation will often
be used in what follows.


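We now state some properties of continuous functions with codomain $F = \mathbb{R}$.

Theorem 2.4 If $f, g : E \to \mathbb{R}$ are continuous at $x_0$, the same is true of $f \cdot g$.

Proof Let $\varepsilon > 0$ be fixed. It is not restrictive to assume $\varepsilon \le 1$, since we could
always define $\varepsilon' = \min\{\varepsilon, 1\}$ and proceed with $\varepsilon'$ instead of $\varepsilon$. By the continuity of
$f$ and $g$ there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that

$$\begin{aligned}
d(x, x_0) < \delta_1 &\;\Rightarrow\; |f(x) - f(x_0)| < \varepsilon, \\
d(x, x_0) < \delta_2 &\;\Rightarrow\; |g(x) - g(x_0)| < \varepsilon.
\end{aligned}$$

Here we note that, since $\varepsilon \le 1$, if $|f(x) - f(x_0)| < \varepsilon$, then $|f(x)| < |f(x_0)| + 1$.
Hence, taking $\delta = \min\{\delta_1, \delta_2\}$, we have that, if $d(x, x_0) < \delta$, then

$$\begin{aligned}
|(f \cdot g)(x) - (f \cdot g)(x_0)| &= |f(x)g(x) - f(x)g(x_0) + f(x)g(x_0) - f(x_0)g(x_0)| \\
&\le |f(x)| \cdot |g(x) - g(x_0)| + |g(x_0)| \cdot |f(x) - f(x_0)| \\
&\le (|f(x_0)| + 1) \cdot |g(x) - g(x_0)| + |g(x_0)| \cdot |f(x) - f(x_0)| \\
&< (|f(x_0)| + |g(x_0)| + 1)\,\varepsilon.
\end{aligned}$$

By the arbitrariness of $\varepsilon$, this proves that $f \cdot g$ is continuous at $x_0$. □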
We now state the property of sign permanence. 


Theorem 2.5 If $g : E \to \mathbb{R}$ is continuous at $x_0$ and $g(x_0) > 0$, then there exists a
neighborhood $U$ of $x_0$ such that

$$x \in U \;\Rightarrow\; g(x) > 0.$$

Proof Let us fix $\varepsilon = g(x_0)$. By continuity, there exists $\delta > 0$ such that

$$d(x, x_0) < \delta \;\Rightarrow\; g(x_0) - \varepsilon < g(x) < g(x_0) + \varepsilon \;\Rightarrow\; 0 < g(x) < 2g(x_0).$$

Then $U = B(x_0, \delta)$ is the neighborhood we are looking for. □


Clearly enough, if $g(x_0) < 0$ were true, then there would exist a neighborhood
$U$ of $x_0$ such that

$$x \in U \;\Rightarrow\; g(x) < 0.$$


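Theorem 2.6 If $f, g : E \to \mathbb{R}$ are continuous at $x_0$ and $g(x_0) \ne 0$, then also $\frac{f}{g}$ is
continuous at $x_0$.


Proof Notice that, by the property of sign permanence, there exists a neighborhood
$U$ of $x_0$ such that the quotient $\frac{f(x)}{g(x)}$ is defined at least for all $x \in U$. Since $\frac{f}{g} = f \cdot \frac{1}{g}$,
it will suffice to prove that $\frac{1}{g}$ is continuous at $x_0$. Let us fix $\varepsilon > 0$; we may assume
without loss of generality that $\varepsilon < \frac{|g(x_0)|}{2}$. By the continuity of $g$ there exists a $\delta > 0$
such that

$$d(x, x_0) < \delta \;\Rightarrow\; |g(x) - g(x_0)| < \varepsilon.$$

Since $\varepsilon < \frac{|g(x_0)|}{2}$, then also

$$d(x, x_0) < \delta \;\Rightarrow\; |g(x)| > |g(x_0)| - \varepsilon > \frac{|g(x_0)|}{2}.$$

As a consequence,

$$d(x, x_0) < \delta \;\Rightarrow\; \left|\frac{1}{g(x)} - \frac{1}{g(x_0)}\right| = \frac{|g(x_0) - g(x)|}{|g(x)|\,|g(x_0)|} < \frac{2}{|g(x_0)|^2}\,\varepsilon.$$

By the arbitrariness of $\varepsilon$, this proves that $\frac{1}{g}$ is continuous at $x_0$. □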


We know that all constant functions are continuous, as is the function $f : \mathbb{R} \to \mathbb{R}$
defined as $f(x) = x$ (see Example 3 presented earlier, with $\alpha = 1$). By the
previous theorems, all polynomial functions are continuous, as are all rational
functions, defined by the quotient of two polynomials. More precisely, these latter
are continuous at all points where they are defined, i.e., where the denominator is
not equal to zero.

Let us now examine the behavior of a composition of continuous functions. In
the following theorem, $E$, $F$, and $G$ are three metric spaces.


Theorem 2.7 Let $f : E \to F$ be continuous at $x_0$ and $g : F \to G$ be continuous
at $f(x_0)$; then $g \circ f$ is continuous at $x_0$.


Proof Let $W$ be a fixed neighborhood of $[g \circ f](x_0) = g(f(x_0))$. By the continuity
of $g$ at $f(x_0)$ there exists a neighborhood $V$ of $f(x_0)$ such that $g(V) \subseteq W$. Then, by
the continuity of $f$ at $x_0$, there exists a neighborhood $U$ of $x_0$ such that $f(U) \subseteq V$.
Hence, $[g \circ f](U) \subseteq W$, thereby proving the statement. □


Let us now define, for every $k = 1, 2, \dots, N$, the “$k$th projection” $p_k : \mathbb{R}^N \to \mathbb{R}$
as

$$p_k(x_1, x_2, \dots, x_N) = x_k.$$


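Theorem 2.8 The functions $p_k$ are continuous.


Proof We consider a point $\mathbf{x}_0 = (x_1^0, x_2^0, \dots, x_N^0) \in \mathbb{R}^N$ and fix an $\varepsilon > 0$. Notice
that, for every $\mathbf{x} = (x_1, x_2, \dots, x_N) \in \mathbb{R}^N$,

$$|x_k - x_k^0| \le \sqrt{\sum_{j=1}^{N} (x_j - x_j^0)^2} = d(\mathbf{x}, \mathbf{x}_0);$$

hence, taking $\delta = \varepsilon$, we have that

$$d(\mathbf{x}, \mathbf{x}_0) < \delta \;\Rightarrow\; |p_k(\mathbf{x}) - p_k(\mathbf{x}_0)| = |x_k - x_k^0| < \varepsilon.$$

This proves that $p_k$ is continuous at $\mathbf{x}_0$. □

Let us consider now a function $f : E \to \mathbb{R}^M$ for some integer $M \ge 1$. We can
define the “components” of $f$ as $f_k = p_k \circ f : E \to \mathbb{R}$, with $k = 1, 2, \dots, M$, so
that

$$f(x) = (f_1(x), f_2(x), \dots, f_M(x)).$$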


Theorem 2.9 The function $f$ is continuous at $x_0$ if and only if all its components
are as well.


Proof If $f$ is continuous at $x_0$, then its components are also continuous since they
are compositions of two continuous functions. To prove the converse, let us assume
that all the components of $f$ are continuous at $x_0$. Fixing $\varepsilon > 0$, for every $k =
1, 2, \dots, M$ there is a $\delta_k > 0$ such that

$$d(x, x_0) < \delta_k \;\Rightarrow\; |f_k(x) - f_k(x_0)| < \varepsilon.$$

Setting $\delta = \min\{\delta_1, \delta_2, \dots, \delta_M\}$, we have

$$d(x, x_0) < \delta \;\Rightarrow\; d(f(x), f(x_0)) = \sqrt{\sum_{j=1}^{M} (f_j(x) - f_j(x_0))^2} < \sqrt{M}\,\varepsilon.$$

By the arbitrariness of $\varepsilon$, the proof is complete. □
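Theorem 2.10 Every linear function $\ell : \mathbb{R}^N \to \mathbb{R}^M$ is continuous.


Proof We first observe that, since the projections $p_k$ are linear functions, the components
$\ell_k = p_k \circ \ell$ of the linear function $\ell$ are linear as well. Let $[\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N]$
be the canonical basis of $\mathbb{R}^N$, i.e.,

$$\begin{aligned}
\mathbf{e}_1 &= (1, 0, 0, \dots, 0), \\
\mathbf{e}_2 &= (0, 1, 0, \dots, 0), \\
&\;\;\vdots \\
\mathbf{e}_N &= (0, 0, 0, \dots, 1).
\end{aligned}$$

Every vector $\mathbf{x} = (x_1, x_2, \dots, x_N) \in \mathbb{R}^N$ can be expressed as

$$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_N\mathbf{e}_N = p_1(\mathbf{x})\mathbf{e}_1 + p_2(\mathbf{x})\mathbf{e}_2 + \cdots + p_N(\mathbf{x})\mathbf{e}_N.$$

Hence, for every $k \in \{1, 2, \dots, M\}$,

$$\ell_k(\mathbf{x}) = p_1(\mathbf{x})\ell_k(\mathbf{e}_1) + p_2(\mathbf{x})\ell_k(\mathbf{e}_2) + \cdots + p_N(\mathbf{x})\ell_k(\mathbf{e}_N),$$

showing that $\ell_k$ is a linear combination of the projections $p_1, p_2, \dots, p_N$. Since
those functions are continuous, we have proved that $\ell_k$ is continuous for every $k \in
\{1, 2, \dots, M\}$. Therefore, since all its components are continuous, the function $\ell$ is
continuous as well. □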
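The key step of the proof, writing each component of $\ell$ as a combination of projections with coefficients $\ell_k(\mathbf{e}_j)$, can be checked on an example. In this sketch the linear function is given by a matrix `A` of our own choosing; `apply_matrix`, `canonical_basis`, and `via_projections` are hypothetical helper names.

```python
def apply_matrix(A, x):
    """A linear map R^N -> R^M given by an M x N matrix A (a list of rows)."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def canonical_basis(n):
    """The vectors e_1, ..., e_n of R^n as tuples."""
    return [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]

def via_projections(ell, x):
    """Rebuild ell(x) componentwise as sum_j p_j(x) * ell_k(e_j)."""
    n = len(x)
    images = [ell(e) for e in canonical_basis(n)]   # the vectors ell(e_j)
    m = len(images[0])
    return tuple(sum(x[j] * images[j][k] for j in range(n)) for k in range(m))

A = [[1, 2, 0],
     [-1, 0, 3]]          # an arbitrary sample map from R^3 to R^2

def ell(x):
    return apply_matrix(A, x)

x = (2, -1, 4)
assert via_projections(ell, x) == ell(x) == (0, 10)
```

Only the values of $\ell$ on the canonical basis are used in `via_projections`, which is exactly why linearity reduces the continuity of $\ell$ to that of the projections.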


We conclude this section with a characterization of continuity involving the 
counterimages of open and closed sets in arbitrary metric spaces. 


Theorem 2.11 The following propositions are equivalent:


(i) $f : E \to F$ is continuous.
(ii) If $A$ is open in $F$, then $f^{-1}(A)$ is open in $E$.
(iii) If $C$ is closed in $F$, then $f^{-1}(C)$ is closed in $E$.


Proof Let us show that (i) implies (ii). Let $f : E \to F$ be continuous, and let $A$
be an open set in $F$. Taking $x_0 \in f^{-1}(A)$, we have $f(x_0) \in A$. Since $A$ is open,
there exists a $\rho > 0$ for which $B(f(x_0), \rho) \subseteq A$. Since $f$ is continuous at $x_0$, taking
$\varepsilon = \rho$ in the definition, there exists a $\delta > 0$ such that $f(B(x_0, \delta)) \subseteq B(f(x_0), \rho)$.
Then $B(x_0, \delta) \subseteq f^{-1}(B(f(x_0), \rho)) \subseteq f^{-1}(A)$, so that $x_0$ is in the interior of
$f^{-1}(A)$. We have thus proved that every $x_0 \in f^{-1}(A)$ is in the interior of $f^{-1}(A)$,
so that $f^{-1}(A)$ is open.

Let us prove now that (ii) implies (i). We consider a point $x_0 \in E$, fix $\varepsilon > 0$, and
set $A = B(f(x_0), \varepsilon)$, which is an open set in $F$. If (ii) holds, then $f^{-1}(A)$ is an open
set in $E$ containing $x_0$. Hence, there exists a $\delta > 0$ such that $B(x_0, \delta) \subseteq f^{-1}(A)$,
meaning $f(B(x_0, \delta)) \subseteq A = B(f(x_0), \varepsilon)$. The continuity of $f$ at $x_0$ is thus proved.

We now show that (ii) implies (iii). Let $C$ be a closed set in $F$, and let $A = \mathcal{C}C$,
the complement of $C$. The set $A$ is open in $F$ so that, if (ii) holds, then
$f^{-1}(A)$ is open in $E$. But $f^{-1}(A) = f^{-1}(\mathcal{C}C) = \mathcal{C}f^{-1}(C)$, so $f^{-1}(C)$ is closed.

In a very similar way one proves that (iii) implies (ii), concluding the proof of
the theorem. □


2.2 Intervals and Continuity
Here is a fundamental property of continuous functions defined on intervals. 


Theorem 2.12 (Bolzano Theorem) If $f : [a, b] \to \mathbb{R}$ is a continuous function
such that

either $f(a) < 0 < f(b)$, or $f(a) > 0 > f(b)$,

then there exists a $c \in\, ]a, b[$ such that $f(c) = 0$.


Proof We treat the case $f(a) < 0 < f(b)$, since the other one is completely
analogous. We set $I_0 = [a, b]$ and consider the midpoint $\frac{a+b}{2}$ of the interval $I_0$.
If $f$ is equal to zero at that point, then we have found the point $c$ we were looking
for. Otherwise, either $f\!\left(\frac{a+b}{2}\right) < 0$ or $f\!\left(\frac{a+b}{2}\right) > 0$. If $f\!\left(\frac{a+b}{2}\right) < 0$, then we call
$I_1$ the interval $\left[\frac{a+b}{2}, b\right]$; if $f\!\left(\frac{a+b}{2}\right) > 0$, then instead we call $I_1$ the interval
$\left[a, \frac{a+b}{2}\right]$. Taking now the midpoint of $I_1$ and following the same reasoning, we can
define an interval $I_2$ and, by recurrence, a sequence of intervals $I_n = [a_n, b_n]$ such
that

$$I_0 \supseteq I_1 \supseteq I_2 \supseteq I_3 \supseteq \cdots$$

and $f(a_n) < 0 < f(b_n)$ for every $n$. By Cantor’s Theorem 1.9, there exists a
$c \in \mathbb{R}$ belonging to all these intervals. Let us prove that $f(c) = 0$. By contradiction,
assume $f(c) \ne 0$. If $f(c) < 0$, by the property of sign permanence, there is a $\delta > 0$
such that $f(x) < 0$ for every $x \in\, ]c - \delta, c + \delta[$. Now, since $b_n - c \le b_n - a_n$ and
$b_n - a_n = \frac{b-a}{2^n} < \frac{b-a}{n}$ for $n \ge 1$, taking $n > \frac{b-a}{\delta}$ we have $b_n \in\, ]c - \delta, c + \delta[$. But
then we should have $f(b_n) < 0$, in contradiction to the above inequality. A similar
line of reasoning rules out the case $f(c) > 0$. □
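The halving construction in the proof is also a practical way to approximate a zero. The sketch below follows it in floating-point arithmetic; the name `bisect_zero`, the tolerance parameter `tol`, and the sample function $x^2 - 2$ are our own assumptions, and the function is taken to be continuous with $f(a) < 0 < f(b)$.

```python
def bisect_zero(f, a, b, tol=1e-12):
    """Approximate a zero of f on [a, b], assuming f continuous and f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tol:
        m = (a + b) / 2          # midpoint of the current interval I_n
        fm = f(m)
        if fm == 0:
            return m             # the midpoint is itself a zero
        if fm < 0:
            a = m                # next interval is [m, b]
        else:
            b = m                # next interval is [a, m]
    return (a + b) / 2

root = bisect_zero(lambda x: x * x - 2, 0.0, 2.0)
assert abs(root - 2 ** 0.5) < 1e-9
```

At every step the invariant $f(a_n) < 0 < f(b_n)$ of the proof is preserved, and the length of the interval is halved, so the loop stops after about 41 steps for this tolerance and starting interval.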


As a consequence of the foregoing theorem, we deduce that a continuous function
“transforms intervals into intervals.”


Corollary 2.13 Let $E$ be a subset of $\mathbb{R}$ and $f : E \to \mathbb{R}$ be a continuous function.
If $I \subseteq E$ is an interval, then $f(I)$ is also an interval.


Proof Excluding the trivial cases where $I$ and $f(I)$ are made of a single element,
let us take $\alpha, \beta \in f(I)$, with $\alpha < \beta$, and let $\gamma$ be such that $\alpha < \gamma < \beta$. We want to
see that $\gamma \in f(I)$. Let $g : E \to \mathbb{R}$ be the function defined by

$$g(x) = f(x) - \gamma.$$

We can find $a, b$ in $I$ such that $f(a) = \alpha$ and $f(b) = \beta$. Since $I$ is an interval,
the function $g$ is defined on $[a, b]$ (or $[b, a]$ in the case $b < a$), and it is continuous
there. Moreover, $g(a) < 0 < g(b)$, and, hence, by the foregoing theorem, there is a
$c \in\, ]a, b[$ (or $]b, a[$) such that $g(c) = 0$, i.e., $f(c) = \gamma$. □


2.3 Monotone Functions

Let $E$ be a subset of $\mathbb{R}$. We will say that a function $f : E \to \mathbb{R}$ is:

“Increasing” if $[\,x_1 < x_2 \Rightarrow f(x_1) \le f(x_2)\,]$.

“Decreasing” if $[\,x_1 < x_2 \Rightarrow f(x_1) \ge f(x_2)\,]$.

“Strictly increasing” if $[\,x_1 < x_2 \Rightarrow f(x_1) < f(x_2)\,]$.

“Strictly decreasing” if $[\,x_1 < x_2 \Rightarrow f(x_1) > f(x_2)\,]$.


We will say that $f$ is “monotone” if it is either increasing or decreasing and “strictly
monotone” if it is either strictly increasing or strictly decreasing.


Example The function $f : [0, +\infty[\, \to \mathbb{R}$ defined as $f(x) = x^n$ is strictly
increasing. The case $n = 2$ was established in Lemma 1.6. The general case can
be easily proved by induction.


Let us now show how one can characterize the continuity of invertible functions 
defined on an interval. 


Theorem 2.14 Let I and J be two intervals, and let f : I > J be an invertible 
function. Then 


f is continuous <> ff is strictly monotone. 
In that case, f~' : J — 1 is also strictly monotone and continuous. 


Proof Assume f to be continuous and, by contradiction, that it is not strictly 
monotone. Then there exist xj < x2 < x3 in J such that either 


f(x.) < f(x2) and f(x2) > f(x3) 


52 2 Continuity 


or 


f(x) > f(v2) and f(x2) < f(x3). 


(Equalities are not allowed since f is injective.) Let us consider the first case, the 
other being analogous. Choosing y € R such that f(x1) < y < f(x2) and f (x2) > 
y > f (x3), by Corollary 2.13 there exist a € ]x1, x2[ and b € x2, x3[ such that 
f@ = y = f(b), in contradiction to the injectivity of f. 

Assume now that $f$ is strictly monotone, e.g., strictly increasing, the other case
being analogous. Once we have fixed some $x_0 \in I$, we want to prove that $f$ is
continuous at $x_0$. Let us consider three distinct cases.

Case 1. Assume that $x_0$ is not an endpoint of the interval $I$ and, consequently, $y_0 = f(x_0)$
is not an endpoint of $J$. Let $\varepsilon > 0$ be fixed; we can assume without
loss of generality that $[y_0 - \varepsilon, y_0 + \varepsilon] \subseteq J$. Set $x_1 = f^{-1}(y_0 - \varepsilon)$ and
$x_2 = f^{-1}(y_0 + \varepsilon)$, and notice that $x_1 < x_0 < x_2$. Since $f(x_1) = f(x_0) - \varepsilon$
and $f(x_2) = f(x_0) + \varepsilon$, taking $\delta = \min\{x_0 - x_1, x_2 - x_0\}$ we have

$$d(x, x_0) < \delta \;\Rightarrow\; x_1 < x < x_2 \;\Rightarrow\; f(x_1) < f(x) < f(x_2) \;\Rightarrow\; d(f(x), f(x_0)) < \varepsilon ,$$

showing that $f$ is continuous at $x_0$.

Case 2. Now let $x_0 = \min I$, hence also $y_0 = \min J$. Let $\varepsilon > 0$ be fixed; we can
assume without loss of generality that $[y_0, y_0 + \varepsilon] \subseteq J$. Set, as previously,
$x_2 = f^{-1}(y_0 + \varepsilon)$. Since $f(x_2) = f(x_0) + \varepsilon$, taking $\delta = x_2 - x_0$, we
have

$$x_0 \le x < x_2 \;\Rightarrow\; f(x_0) \le f(x) < f(x_2) \;\Rightarrow\; d(f(x), f(x_0)) < \varepsilon ,$$

demonstrating that $f$ is continuous at $x_0$.

Case 3. If $x_0 = \max I$, then the argument is similar to that in Case 2.


Finally, we observe that

$$f \text{ strictly increasing} \;\Rightarrow\; f^{-1} \text{ strictly increasing},$$

$$f \text{ strictly decreasing} \;\Rightarrow\; f^{-1} \text{ strictly decreasing}.$$

Therefore, if $f$ is strictly monotone, then so is $f^{-1}$. Hence, since $f^{-1} : J \to I$ is
invertible and strictly monotone, as proved earlier, it is necessarily continuous. ∎



2.4 The Exponential Function 

Let us denote by $\mathbb{R}_+$ the set of positive real numbers, i.e.,

$$\mathbb{R}_+ = \,]0, +\infty[\, = \{x \in \mathbb{R} : x > 0\}.$$

The following theorem will be proved in Chap. 5. 


Theorem 2.15 Given $a > 0$, there exists a unique continuous function $f_a : \mathbb{R} \to \mathbb{R}_+$
such that

(i) $f_a(x_1 + x_2) = f_a(x_1) f_a(x_2)$, for every $x_1, x_2$ in $\mathbb{R}$.
(ii) $f_a(1) = a$.

Moreover, if $a \ne 1$, then this function $f_a$ is invertible.

The function $f_a$ is called “exponential to base $a$” and is denoted by $\exp_a$. If
$a \ne 1$, then the inverse function $f_a^{-1} : \mathbb{R}_+ \to \mathbb{R}$ is called the “logarithm to base $a$”
and is denoted by $\log_a$. By Theorem 2.14, it is a continuous function. We can write,
for $x \in \mathbb{R}$ and $y \in \mathbb{R}_+$,

$$\exp_a(x) = y \iff x = \log_a(y).$$
From the properties

(i) $\exp_a(x_1 + x_2) = \exp_a(x_1) \exp_a(x_2)$,
(ii) $\exp_a(1) = a$

we directly deduce the corresponding properties of the logarithm

(j) $\log_a(y_1 y_2) = \log_a(y_1) + \log_a(y_2)$,
(jj) $\log_a(a) = 1$.

Since the constant function $f(x) = 1$ verifies (i) and (ii), with $a = 1$, by the
uniqueness of that function we deduce that $f = \exp_1$, i.e.,

$$\exp_1(x) = 1, \quad \text{for every } x \in \mathbb{R}.$$

Let us now deduce from (i) and (ii) some general properties of the exponential
function. First of all, we observe that, since $\exp_a(1) = \exp_a(1 + 0) = \exp_a(1) \exp_a(0)$,
it must be that

$$\exp_a(0) = 1.$$

Let us now prove that, for every $x \in \mathbb{R}$ and every $n \in \mathbb{N}$,

$$\exp_a(nx) = (\exp_a(x))^n.$$

We argue by induction. If $n = 0$, then we see that

$$\exp_a(0x) = 1, \qquad (\exp_a(x))^0 = 1,$$

hence the identity surely holds. Assume now that the formula is true for some $n \in \mathbb{N}$.
Then

$$\exp_a((n + 1)x) = \exp_a(nx + x) = \exp_a(nx) \exp_a(x) = (\exp_a(x))^n \exp_a(x) = (\exp_a(x))^{n+1},$$

hence it is true also for $n + 1$. The proof of the formula is thus completed.

Taking $x = 1$, we see that

$$\exp_a(n) = a^n$$

for every $n \in \mathbb{N}$. This fact motivates a new notation: For every $x \in \mathbb{R}$, we will often
write $a^x$ instead of $\exp_a(x)$.
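Taking $n \in \mathbb{N} \setminus \{0\}$ and writing

$$a = \exp_a(1) = \exp_a\Big(n \cdot \frac{1}{n}\Big) = \Big(\exp_a\Big(\frac{1}{n}\Big)\Big)^n,$$

we see that $\exp_a(\frac{1}{n})$ is that number $u \in \mathbb{R}_+$ that solves the equation $u^n = a$. Such
a number $u$ is called the “$n$th root of $a$,” and it is denoted by $u = \sqrt[n]{a}$, i.e.,

$$\exp_a\Big(\frac{1}{n}\Big) = \sqrt[n]{a}.$$

Hence, if $m \in \mathbb{N}$ and $n \in \mathbb{N} \setminus \{0\}$,

$$\exp_a\Big(\frac{m}{n}\Big) = \exp_a\Big(m \cdot \frac{1}{n}\Big) = \Big(\exp_a\Big(\frac{1}{n}\Big)\Big)^m = (\sqrt[n]{a})^m.$$

On the other hand, writing

$$1 = \exp_a(0) = \exp_a(x - x) = \exp_a(x) \exp_a(-x),$$

we see that

$$\exp_a(-x) = \frac{1}{\exp_a(x)}, \quad \text{for every } x \in \mathbb{R}.$$

In particular, if $m \in \mathbb{N}$ and $n \in \mathbb{N} \setminus \{0\}$,

$$\exp_a\Big(\frac{-m}{n}\Big) = \frac{1}{\exp_a(\frac{m}{n})} = \frac{1}{(\sqrt[n]{a})^m} = (\sqrt[n]{a})^{-m}.$$

Let us check for a moment that, for every $m \in \mathbb{Z}$ and $n \in \mathbb{N} \setminus \{0\}$, we have

$$(\sqrt[n]{a})^m = \sqrt[n]{a^m}.$$

Indeed, if $b = \sqrt[n]{a}$, then $a^m = (b^n)^m = b^{nm} = (b^m)^n$, whence $b^m = \sqrt[n]{a^m}$.
We can thus conclude that

$$\exp_a\Big(\frac{m}{n}\Big) = \sqrt[n]{a^m}, \quad \text{for every } \frac{m}{n} \in \mathbb{Q}.$$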
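The formula for rational exponents lends itself to a quick numerical check. The sketch below is only an illustration in floating-point arithmetic; the helper name `exp_rational` is hypothetical, and the $n$th root is computed with Python's power operator.

```python
import math
from fractions import Fraction


def exp_rational(a, q):
    """Return a**(m/n) for a > 0 and q = m/n, computed as the n-th root of a**m."""
    if a <= 0:
        raise ValueError("the base a must be positive")
    q = Fraction(q)
    m, n = q.numerator, q.denominator  # Fraction normalizes so that n >= 1
    return (float(a) ** m) ** (1.0 / n)


a = 2.0
q1, q2 = Fraction(3, 2), Fraction(-1, 3)

# Property (i): exp_a(q1 + q2) = exp_a(q1) * exp_a(q2), up to rounding.
lhs = exp_rational(a, q1 + q2)
rhs = exp_rational(a, q1) * exp_rational(a, q2)
```

Up to rounding errors, `lhs` and `rhs` agree, in accordance with property (i) restricted to rational exponents.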


If $a \ne 1$, then the exponential function $\exp_a : \mathbb{R} \to \mathbb{R}_+$ is continuous and
invertible and, hence, strictly monotone. Since $\exp_a(0) = 1$ and $\exp_a(1) = a$,

$$\exp_a \text{ is: } \begin{cases} \text{strictly increasing} & \text{if } a > 1; \\ \text{strictly decreasing} & \text{if } 0 < a < 1. \end{cases}$$

(See Fig. 2.1) Let us also emphasize the following three important formulas:

$$(ab)^x = a^x b^x, \qquad \Big(\frac{1}{a}\Big)^x = \frac{1}{a^x} = a^{-x}, \qquad (a^y)^x = a^{yx}.$$

The first one follows from the fact that the function $f(x) = a^x b^x$ verifies the
property (i) and $f(1) = ab$, hence $f = \exp_{ab}$. Analogously, for the second one,
we take $f(x) = a^{-x}$. For the third one, simply take $f(x) = a^{yx}$.


Fig. 2.1 The function $\exp_a$, with $a > 1$ and $a < 1$, respectively

Fig. 2.2 The function $\log_a$, with $a > 1$ and $a < 1$, respectively


If $a \ne 1$, we have that

$$\log_a \text{ is: } \begin{cases} \text{strictly increasing} & \text{if } a > 1, \\ \text{strictly decreasing} & \text{if } 0 < a < 1. \end{cases}$$

(See Fig. 2.2) The following are two important formulas for the logarithm:

$$\log_a(x^y) = y \log_a(x), \qquad \log_b(x) = \frac{\log_a(x)}{\log_a(b)}.$$

To prove the first one, set $u = \log_a(x^y)$ and $v = \log_a(x)$. Then $a^u = x^y$ and $a^v = x$,
and hence $a^u = (a^v)^y = a^{vy}$. By the injectivity of $\exp_a$, it must be that $u = vy$,
and the first formula is proved. To prove the second formula, set $u = \log_b(x)$,
$v = \log_a(x)$, and $w = \log_a(b)$. Then $b^u = x$, $a^v = x$, and $a^w = b$, whence
$a^v = (a^w)^u = a^{wu}$. By injectivity, it must be that $v = wu$, and the second formula
is also proved.
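Both formulas for the logarithm can be checked numerically. This is a minimal sketch, assuming that Python's `math.log` (the natural logarithm) is available as a building block; the helper name `log_base` and the sample values are hypothetical.

```python
import math


def log_base(a, x):
    """Logarithm of x > 0 to a base a > 0, a != 1, built from natural logarithms."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1 and x > 0")
    return math.log(x) / math.log(a)


a, b, x, y = 2.0, 10.0, 7.5, 3.0

# log_base inverts the exponential to base a.
inverse_check = a ** log_base(a, x)                     # should be close to x

# First formula: log_a(x^y) = y log_a(x).
first_lhs, first_rhs = log_base(a, x ** y), y * log_base(a, x)

# Second formula: log_b(x) = log_a(x) / log_a(b).
second_lhs, second_rhs = log_base(b, x), log_base(a, x) / log_base(a, b)
```

Each pair of values agrees up to rounding errors.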


2.5 The Trigonometric Functions 
We will now introduce the trigonometric functions following a path similar to that 
traced previously for the exponential function. 

Given a real number $T > 0$, a function $F : \mathbb{R} \to \Omega$, where $\Omega$ is any possible
set, is said to be “periodic with period $T$,” or “$T$-periodic” for short, if

$$F(x + T) = F(x), \quad \text{for every } x \in \mathbb{R}.$$

Clearly, if $T$ is a period for the function $F$, then $2T, 3T, \dots$ are also periods for $F$.
We will say that $T$ is the “minimal period” if there are no smaller periods.

Let us introduce the set

$$S^1 = \{z \in \mathbb{C} : |z| = 1\},$$

i.e., the circle centered at the origin, with radius 1, in the complex field $\mathbb{C}$.
The following theorem will be proved in Chap. 5. 


Theorem 2.16 Given $T > 0$, there exists a unique function $h_T : \mathbb{R} \to S^1$,
continuous and periodic with minimal period $T$, such that

(i) $h_T(x_1 + x_2) = h_T(x_1) h_T(x_2)$, for every $x_1, x_2$ in $\mathbb{R}$.
(ii) $h_T\big(\frac{T}{4}\big) = i$.

The function $h_T$ is called the “circular function to base $T$.” Since $S^1$ is indeed a
subset of $\mathbb{R}^2$, the function $h_T$ has two components, which will be denoted by $\cos_T$
and $\sin_T$; they will be called “cosine to base $T$” and “sine to base $T$,” respectively.
We can then write, for every $x \in \mathbb{R}$,

$$h_T(x) = (\cos_T(x), \sin_T(x)), \quad \text{or} \quad h_T(x) = \cos_T(x) + i \sin_T(x).$$

These functions are $T$-periodic, and from the properties of the circular function we
have that

(a) $(\cos_T(x))^2 + (\sin_T(x))^2 = 1$.
(b) $\cos_T(x_1 + x_2) = \cos_T(x_1) \cos_T(x_2) - \sin_T(x_1) \sin_T(x_2)$.
(c) $\sin_T(x_1 + x_2) = \sin_T(x_1) \cos_T(x_2) + \cos_T(x_1) \sin_T(x_2)$.
(d) $\cos_T\big(\frac{T}{4}\big) = 0$, $\sin_T\big(\frac{T}{4}\big) = 1$.

Let us now focus our attention on the interval $[0, T[$. Writing

$$i = h_T\big(\tfrac{T}{4}\big) = h_T\big(0 + \tfrac{T}{4}\big) = h_T(0) h_T\big(\tfrac{T}{4}\big) = h_T(0)\, i,$$

we see that $h_T(0) = 1$. Moreover,

$$h_T\big(\tfrac{T}{2}\big) = h_T\big(\tfrac{T}{4} + \tfrac{T}{4}\big) = h_T\big(\tfrac{T}{4}\big) h_T\big(\tfrac{T}{4}\big) = i^2 = -1,$$

whereas

$$h_T\big(\tfrac{3T}{4}\big) = h_T\big(\tfrac{T}{2} + \tfrac{T}{4}\big) = h_T\big(\tfrac{T}{2}\big) h_T\big(\tfrac{T}{4}\big) = (-1)\, i = -i.$$


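Summing up,

$$\cos_T(0) = 1, \qquad \sin_T(0) = 0,$$
$$\cos_T\big(\tfrac{T}{4}\big) = 0, \qquad \sin_T\big(\tfrac{T}{4}\big) = 1,$$
$$\cos_T\big(\tfrac{T}{2}\big) = -1, \qquad \sin_T\big(\tfrac{T}{2}\big) = 0,$$
$$\cos_T\big(\tfrac{3T}{4}\big) = 0, \qquad \sin_T\big(\tfrac{3T}{4}\big) = -1.$$
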
Now, from

$$1 = h_T(0) = h_T(x - x) = h_T(x) h_T(-x)$$

we have $h_T(-x) = h_T(x)^{-1} = h_T(x)^*$, being $|h_T(x)| = 1$. Hence,

$$\cos_T(-x) = \cos_T(x), \qquad \sin_T(-x) = -\sin_T(x),$$

showing that $\cos_T$ is an even function, whereas $\sin_T$ is odd.
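The properties obtained so far can be observed on a concrete model. The sketch below assumes, as a standard fact not yet established at this point of the text, that the circular function to base $T$ is given by $x \mapsto e^{2\pi i x/T}$; the function names `h`, `cos_T` and `sin_T` are hypothetical.

```python
import cmath
import math


def h(T, x):
    """Assumed model of the circular function to base T: exp(2*pi*i*x/T)."""
    return cmath.exp(2j * math.pi * x / T)


def cos_T(T, x):
    return h(T, x).real


def sin_T(T, x):
    return h(T, x).imag


T = 3.0
x1, x2 = 0.3, 1.1

quarter = h(T, T / 4)                                  # property (ii): close to i
additive_gap = abs(h(T, x1 + x2) - h(T, x1) * h(T, x2))  # property (i)
periodic_gap = abs(h(T, x1 + T) - h(T, x1))              # T-periodicity
conj_gap = abs(h(T, -x1) - h(T, x1).conjugate())          # h_T(-x) = h_T(x)^*
```

All three gaps are of the order of the rounding error, and `quarter` is numerically indistinguishable from $i$.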

Let us prove now that $h_T : [0, T[\, \to S^1$, the restriction of $h_T$ to the interval
$[0, T[$, is bijective. First, injectivity: Take $\alpha < \beta$ in $[0, T[$. By contradiction, if
$h_T(\alpha) = h_T(\beta)$, then

$$h_T(\beta - \alpha) = h_T(\beta) h_T(-\alpha) = \frac{h_T(\beta)}{h_T(\alpha)} = 1,$$

and hence

$$h_T(x + (\beta - \alpha)) = h_T(x) h_T(\beta - \alpha) = h_T(x), \quad \text{for every } x \in \mathbb{R},$$

so that $\beta - \alpha$ would be a period for $h_T$ smaller than $T$, while we know that $T$ is the
minimal period. Then $h_T(\alpha) \ne h_T(\beta)$, proving that $h_T$ is injective.
We now prove that

$$\cos_T(x) \begin{cases} > 0 & \text{if } 0 < x < \frac{T}{4}, \\ < 0 & \text{if } \frac{T}{4} < x < \frac{3T}{4}, \\ > 0 & \text{if } \frac{3T}{4} < x < T, \end{cases} \qquad \sin_T(x) \begin{cases} > 0 & \text{if } 0 < x < \frac{T}{2}, \\ < 0 & \text{if } \frac{T}{2} < x < T. \end{cases}$$

(See Fig. 2.3) For example, if $x \in \,]0, \frac{T}{2}[$, then surely it cannot be that $\sin_T(x) = 0$,
otherwise $h_T(x)$ would coincide with either $h_T(0)$ or $h_T\big(\frac{T}{2}\big)$, contradicting the
already proved injectivity. Then, by continuity (as a consequence of Bolzano’s
Theorem 2.12), $\sin_T$ must be either always positive or always negative on $]0, \frac{T}{2}[$.
Since $\sin_T\big(\frac{T}{4}\big) = 1$, it needs to be always positive on that interval.

Fig. 2.3 The trigonometric functions $\cos_T$ and $\sin_T$


Now, to conclude, let us prove that $h_T : [0, T[\, \to S^1$ is surjective. Take a point
$P = (X_1, X_2)$ in $S^1$. Notice that $X_1 \in [-1, 1]$. The two cases $X_1 = -1$ and $X_1 = 1$
imply $X_2 = 0$, and we already know that $h_T\big(\frac{T}{2}\big) = (-1, 0)$ and $h_T(0) = (1, 0)$.
Assume, then, that $X_1 \in \,]-1, 1[$. Since $\cos_T(0) = 1$, $\cos_T\big(\frac{T}{2}\big) = -1$, and $\cos_T$ is
continuous, by Bolzano’s Theorem 2.12 there is an $x \in \,]0, \frac{T}{2}[$ such that $\cos_T(x) = X_1$.
Then

$$|\sin_T(x)| = \sqrt{1 - (\cos_T(x))^2} = \sqrt{1 - X_1^2} = |X_2|.$$

We have two possibilities: Either $\sin_T(x) = X_2$, in which case $h_T(x) = P$, or
$\sin_T(x) = -X_2$, and

$$h_T(T - x) = h_T(-x) = h_T(x)^* = (X_1, -X_2)^* = (X_1, X_2) = P.$$

Since $T - x \in \,]\frac{T}{2}, T[$, we have proved that $h_T$ is surjective.


2.6 Other Examples of Continuous Functions 


We define the “tangent to base $T$” as

$$\tan_T(x) = \frac{\sin_T(x)}{\cos_T(x)}.$$

Its natural domain is the set $\{x \in \mathbb{R} : x \ne \frac{T}{4} + k\frac{T}{2},\ k \in \mathbb{Z}\}$. It is a continuous
function, and it is periodic, with minimal period $\frac{T}{2}$ (See Fig. 2.4).

Fig. 2.4 The trigonometric function $\tan_T$

Fig. 2.5 The hyperbolic function $\cosh_a$

Let us also define the “hyperbolic functions”

$$\cosh_a(x) = \frac{a^x + a^{-x}}{2}, \qquad \sinh_a(x) = \frac{a^x - a^{-x}}{2},$$


where $a > 0$ is fixed (Figs. 2.5 and 2.6). The “hyperbolic cosine” and “hyperbolic
sine” to base $a$ are continuous functions, and it can be verified that they satisfy the
following identities:

(a) $(\cosh_a(x))^2 - (\sinh_a(x))^2 = 1$.
(b) $\cosh_a(x_1 + x_2) = \cosh_a(x_1) \cosh_a(x_2) + \sinh_a(x_1) \sinh_a(x_2)$.
(c) $\sinh_a(x_1 + x_2) = \sinh_a(x_1) \cosh_a(x_2) + \cosh_a(x_1) \sinh_a(x_2)$.
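The three identities can be checked numerically straight from the definitions. The following is a minimal sketch; the sample base and points are hypothetical, and the comparison is only up to rounding errors.

```python
import math


def cosh_a(a, x):
    """Hyperbolic cosine to base a > 0."""
    return (a ** x + a ** (-x)) / 2


def sinh_a(a, x):
    """Hyperbolic sine to base a > 0."""
    return (a ** x - a ** (-x)) / 2


a, x1, x2 = 3.0, 0.7, -1.2

# (a) cosh^2 - sinh^2 = 1
identity_a = cosh_a(a, x1) ** 2 - sinh_a(a, x1) ** 2

# (b) and (c): addition formulas, left-hand and right-hand sides
b_lhs = cosh_a(a, x1 + x2)
b_rhs = cosh_a(a, x1) * cosh_a(a, x2) + sinh_a(a, x1) * sinh_a(a, x2)
c_lhs = sinh_a(a, x1 + x2)
c_rhs = sinh_a(a, x1) * cosh_a(a, x2) + cosh_a(a, x1) * sinh_a(a, x2)
```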


Fig. 2.6 The hyperbolic function $\sinh_a$ when $a > 1$ and $a < 1$, respectively

Fig. 2.7 The hyperbolic function $\tanh_a$ when $a > 1$ and $a < 1$, respectively


The striking analogies with the trigonometric functions can be explained recalling 
the similar properties of the exponential and circular functions. They will be further 
investigated in Sect. 8.4.2. 

We can now define the “hyperbolic tangent” to base $a$ as

$$\tanh_a(x) = \frac{\sinh_a(x)}{\cosh_a(x)}.$$

Its domain is the whole real line $\mathbb{R}$, and it is continuous (Fig. 2.7).
Let us now consider two examples of functions and examine their continuity. 


Example 1 Let $f : \mathbb{R} \to \mathbb{R}$ be defined as

$$f(x) = \begin{cases} \sin_T\big(\frac{1}{x}\big) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

If $x_0 \ne 0$, then the function $f$ is continuous at $x_0$, since it is the composition
of continuous functions. In contrast, if $x_0 = 0$, then it is not continuous at $x_0$ since
in every neighborhood of 0 there are values of $x$ for which $f(x) = 1$, whereas
$f(0) = 0$.

Example 2 Now let $f : \mathbb{R} \to \mathbb{R}$ be defined by

$$f(x) = \begin{cases} x \sin_T\big(\frac{1}{x}\big) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

This function is continuous on the whole $\mathbb{R}$. Indeed, if $x_0 \ne 0$, then the situation
is similar to the previously described one. If now $x_0 = 0$, then it is useful to
observe that

$$|f(x)| \le |x|, \quad \text{for every } x \in \mathbb{R}.$$

Thus, once $\varepsilon > 0$ is fixed, it is sufficient to choose $\delta = \varepsilon$ to have that

$$|x - 0| < \delta \;\Rightarrow\; |f(x) - f(0)| < \varepsilon.$$
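The two examples above can be explored numerically. The sketch below assumes the period $T = 2\pi$, for which $\sin_T$ is modelled here by Python's `math.sin` (an assumption, since the text has not yet identified the two); the names `f1`, `f2` and `peaks` are hypothetical.

```python
import math

T = 2 * math.pi  # assumed period, so that sin_T is modelled by math.sin


def f1(x):
    """Example 1: sin(1/x) for x != 0, and 0 at x = 0."""
    return math.sin(1 / x) if x != 0 else 0.0


def f2(x):
    """Example 2: x*sin(1/x) for x != 0, and 0 at x = 0."""
    return x * math.sin(1 / x) if x != 0 else 0.0


# Points x_k tending to 0 with 1/x_k = T/4 + k*T, where f1 takes the value 1.
peaks = [1 / (T / 4 + k * T) for k in range(1, 6)]
values_at_peaks = [f1(x) for x in peaks]

# For Example 2, |f2(x)| <= |x|, so delta = eps works at x0 = 0.
samples = [1e-3, -2e-5, 0.37, 0.0]
bounded = [abs(f2(x)) <= abs(x) for x in samples]
```

The list `values_at_peaks` consists of numbers very close to 1 at points arbitrarily near 0, while every entry of `bounded` is `True`.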


3 Limits


We will now introduce another fundamental concept that, however, is strongly 
related to continuity. It is the notion of the “limit” of a function, a local notion, 
as we will see. As in Chap. 2, the theory will be developed within the framework of 
metric spaces. 


3.1 The Notion of Limit 


Our general setting involves two metric spaces, $E$ and $F$, a point $x_0$ of $E$, and a
function

$$f : E \to F \quad \text{or} \quad f : E \setminus \{x_0\} \to F,$$

not necessarily defined in $x_0$.

Definition 3.1 If there exists $l \in F$ such that the function $\tilde f : E \to F$, defined by

$$\tilde f(x) = \begin{cases} f(x) & \text{if } x \ne x_0, \\ l & \text{if } x = x_0, \end{cases}$$

is continuous at $x_0$, then $l$ is said to be a “limit of $f$ at $x_0$,” or also a “limit of $f(x)$
as $x$ tends to $x_0$,” and we can write

$$l = \lim_{x \to x_0} f(x).$$


In other terms, $l$ is a limit of $f$ at $x_0$ if and only if

$$\forall \varepsilon > 0 \quad \exists \delta > 0 : \quad \forall x \in E \quad 0 < d_E(x, x_0) < \delta \;\Rightarrow\; d_F(f(x), l) < \varepsilon,$$

or, equivalently,

$$\forall V \text{ neighborhood of } l \quad \exists U \text{ neighborhood of } x_0 : \quad f(U \setminus \{x_0\}) \subseteq V.$$

Sometimes we can also write “$f(x) \to l$ as $x \to x_0$.”

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_3


We know that if $x_0$ happens to be an isolated point, then the function $\tilde f$ defined
earlier will be continuous at $x_0$ for every $l \in F$. Therefore, the notion of limit is
of no interest at all in this case. This is why we will always assume that $x_0$ is not
an isolated point, in which case we say that $x_0$ is a “cluster point” of $E$: Every
neighborhood of $x_0$ contains some point of $E$ that differs from $x_0$ itself.

Note that if $x_0$ is a cluster point of $E$, then every neighborhood $U_0$ of $x_0$ contains
infinitely many points of $E$. Indeed, once we have found $x_1 \ne x_0$ in $U_0$, it is possible
to choose a neighborhood $U_1$ of $x_0$ that does not contain $x_1$. Then we can find
$x_2 \ne x_0$ in $U_1$, and so on.

From now on we will assume that $x_0$ is a cluster point of $E$. This assumption also
enables us to prove the following proposition.


Proposition 3.2 If a limit of $f$ at $x_0$ exists, then it is unique.

Proof Assume by contradiction that there are two limits, $l$ and $l'$, which are distinct.
Let us take $\varepsilon = \frac{1}{2} d(l, l')$. Then there exists a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; d(f(x), l) < \varepsilon,$$

and there exists a $\delta' > 0$ such that

$$0 < d(x, x_0) < \delta' \;\Rightarrow\; d(f(x), l') < \varepsilon.$$

Let $x \ne x_0$ be such that $d(x, x_0) < \delta$ and $d(x, x_0) < \delta'$ (such an $x$ exists because
$x_0$ is a cluster point). Then

$$d(l, l') \le d(l, f(x)) + d(f(x), l') < 2\varepsilon = d(l, l'),$$

a contradiction. ∎

The following relationship will surely be useful.

Proposition 3.3 The equivalence

$$\lim_{x \to x_0} f(x) = l \iff \lim_{x \to x_0} d(f(x), l) = 0$$

always holds true.

Proof The function $F(x) = d(f(x), l)$ has real values, and the distance in $\mathbb{R}$ is
$d(\alpha, \beta) = |\alpha - \beta|$. The conclusion easily follows from the definitions. ∎


The following proposition underlines the strong relationship between the concepts
of limit and continuity.

Proposition 3.4 For any function $f : E \to F$,

$$f \text{ is continuous at } x_0 \iff \lim_{x \to x_0} f(x) = f(x_0).$$

Proof In this case, the function $\tilde f$ coincides with $f$. ∎


3.2 Some Properties of Limits

Let us start with those properties of limits that are directly inherited from those of
continuous functions. In the following statements, the functions $f$ and $g$ are defined
on $E$ or $E \setminus \{x_0\}$, indifferently, and $x_0$ is a cluster point of $E$.


Theorem 3.5 Let $F$ be a normed vector space, $f, g : E \setminus \{x_0\} \to F$ two functions
such that

$$l_1 = \lim_{x \to x_0} f(x), \qquad l_2 = \lim_{x \to x_0} g(x),$$

and $\alpha \in \mathbb{R}$. Then

$$\lim_{x \to x_0} [f(x) + g(x)] = l_1 + l_2, \qquad \lim_{x \to x_0} [\alpha f(x)] = \alpha l_1.$$

Assume now that $F = \mathbb{R}^M$ for some integer $M \ge 1$. For any function
$f : E \setminus \{x_0\} \to \mathbb{R}^M$ we can consider its components $f_k : E \setminus \{x_0\} \to \mathbb{R}$, with
$k = 1, 2, \dots, M$, and we can write

$$f(x) = (f_1(x), f_2(x), \dots, f_M(x)).$$

Theorem 3.6 The limit $\lim_{x \to x_0} f(x) = l \in \mathbb{R}^M$ exists if and only if all the limits
$\lim_{x \to x_0} f_k(x) = l_k \in \mathbb{R}$ exist, with $k = 1, 2, \dots, M$. In that case, $l = (l_1, l_2, \dots, l_M)$,
i.e.,

$$\lim_{x \to x_0} f(x) = \Big( \lim_{x \to x_0} f_1(x), \lim_{x \to x_0} f_2(x), \dots, \lim_{x \to x_0} f_M(x) \Big).$$

Proof This is a direct consequence of Theorem 2.9 for continuous functions. ∎


We now assume that $F = \mathbb{R}$ and state the property of sign permanence.

Theorem 3.7 Let $g : E \setminus \{x_0\} \to \mathbb{R}$ be such that

$$\lim_{x \to x_0} g(x) > 0.$$

Then there exists a neighborhood $U$ of $x_0$ such that

$$x \in U \setminus \{x_0\} \;\Rightarrow\; g(x) > 0.$$

Similarly, if

$$\lim_{x \to x_0} g(x) < 0,$$

then there exists a neighborhood $U$ of $x_0$ such that

$$x \in U \setminus \{x_0\} \;\Rightarrow\; g(x) < 0.$$

As an immediate consequence, we have the following corollary.

Corollary 3.8 If there is a neighborhood $U$ of $x_0$ such that $g(x) \le 0$ for every
$x \in U \setminus \{x_0\}$, then, if the limit exists,

$$\lim_{x \to x_0} g(x) \le 0.$$

Similarly, if $g(x) \ge 0$ for every $x \in U \setminus \{x_0\}$, then, if the limit exists,

$$\lim_{x \to x_0} g(x) \ge 0.$$
x—>Xx0 


Still assuming that $F = \mathbb{R}$, we now consider the product and the quotient of two
functions.

Theorem 3.9 Let $f, g : E \setminus \{x_0\} \to \mathbb{R}$ be such that

$$l_1 = \lim_{x \to x_0} f(x), \qquad l_2 = \lim_{x \to x_0} g(x).$$

Then

$$\lim_{x \to x_0} [f(x) g(x)] = l_1 l_2.$$

Moreover, if $l_2 \ne 0$, then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{l_1}{l_2}.$$


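Now let $E$ be any metric space and $F = \mathbb{R}$. The following theorem has a strange
name, indeed.

Theorem 3.10 (Squeeze Theorem) Let $F_1, F_2 : E \setminus \{x_0\} \to \mathbb{R}$ be such that

$$\lim_{x \to x_0} F_1(x) = \lim_{x \to x_0} F_2(x) = l.$$

If $f : E \setminus \{x_0\} \to \mathbb{R}$ has the property that

$$F_1(x) \le f(x) \le F_2(x) \quad \text{for every } x \in E \setminus \{x_0\},$$

then

$$\lim_{x \to x_0} f(x) = l.$$

Proof Once $\varepsilon > 0$ has been fixed, there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that

$$0 < d(x, x_0) < \delta_1 \;\Rightarrow\; l - \varepsilon < F_1(x) < l + \varepsilon,$$
$$0 < d(x, x_0) < \delta_2 \;\Rightarrow\; l - \varepsilon < F_2(x) < l + \varepsilon.$$

Taking $\delta = \min\{\delta_1, \delta_2\}$, we have

$$0 < d(x, x_0) < \delta \;\Rightarrow\; l - \varepsilon < F_1(x) \le f(x) \le F_2(x) < l + \varepsilon,$$

thereby completing the proof. ∎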

As a consequence, we have the following corollary.

Corollary 3.11 Let $f, g : E \setminus \{x_0\} \to \mathbb{R}$ be such that

$$\lim_{x \to x_0} f(x) = 0,$$

and there is a constant $C > 0$ such that

$$|g(x)| \le C, \quad \text{for every } x \in E \setminus \{x_0\}.$$

Then

$$\lim_{x \to x_0} f(x) g(x) = 0.$$

Proof After noticing that

$$-C|f(x)| \le f(x) g(x) \le C|f(x)|,$$

and recalling that, by Proposition 3.3,

$$\lim_{x \to x_0} f(x) = 0 \iff \lim_{x \to x_0} |f(x)| = 0,$$

the result follows from the Squeeze Theorem 3.10, taking $F_1(x) = -C|f(x)|$ and
$F_2(x) = C|f(x)|$. ∎

Remark 3.12 Returning to the statements of the preceding theorem and corollary,
we realize that “for every $x \in E \setminus \{x_0\}$” could be weakened to “for every $x \ne x_0$ in
a neighborhood of $x_0$.” This is due to the fact that the notion of limit relates only to
the local behavior of the function near $x_0$. This observation holds in general when
dealing with limits and will often be used in what follows.


3.3 Change of Variables in the Limit

We now return to the general setting in metric spaces and examine the composition
of two functions, $f$ and $g$. We have two interesting situations. In the first one, some
continuity of $g$ is needed.

Theorem 3.13 Let $f : E \to F$, or $f : E \setminus \{x_0\} \to F$, be such that

$$\lim_{x \to x_0} f(x) = l.$$

If $g : F \to G$ is continuous at $l$, then

$$\lim_{x \to x_0} g(f(x)) = g(l),$$

i.e.,

$$\lim_{x \to x_0} g(f(x)) = g\Big( \lim_{x \to x_0} f(x) \Big).$$

Proof Recalling the definition of limit, we know that the function $\tilde f : E \to F$ is
continuous at $x_0$, whereas $g$ is continuous at $l = \tilde f(x_0)$. Hence, $g \circ \tilde f$ is continuous
at $x_0$, so that, recalling that $\tilde f(x) = f(x)$ when $x \ne x_0$,

$$\lim_{x \to x_0} g(f(x)) = \lim_{x \to x_0} g(\tilde f(x)) = g(\tilde f(x_0)) = g(l),$$

as we wanted to prove. ∎


The following theorem gives us the change of variables formula. The function
$g$ does not have to be continuous at the limit point of $f$, and indeed it could even
not be defined there.

Theorem 3.14 Let $f : E \to F$, or $f : E \setminus \{x_0\} \to F$, be such that

$$\lim_{x \to x_0} f(x) = l.$$

Assume, moreover, that $f(x) \ne l$ for every $x \ne x_0$ in a neighborhood of $x_0$. Let
$g : F \to G$, or $g : F \setminus \{l\} \to G$, be such that

$$\lim_{y \to l} g(y) = L.$$

Then

$$\lim_{x \to x_0} g(f(x)) = L,$$

i.e.,

$$\lim_{x \to x_0} g(f(x)) = \lim_{y \to \lim_{x \to x_0} f(x)} g(y). \tag{3.1}$$

In the preceding formula, we say that the “change of variables $y = f(x)$” has
been performed in the limit.

Proof We first observe that, in view of the assumptions, $g \circ f$ is defined on $U \setminus \{x_0\}$
for some neighborhood $U$ of $x_0$. Moreover, $l$ is a cluster point of $F$. Recalling again
the definition of limit, we know that the function $\tilde f : E \to F$ is continuous at $x_0$,
with $\tilde f(x_0) = l$. Similarly, let us introduce the function $\tilde g : F \to G$, defined as

$$\tilde g(y) = \begin{cases} g(y) & \text{if } y \ne l, \\ L & \text{if } y = l. \end{cases}$$

This function is continuous at $l$, so the composition $\tilde g \circ \tilde f$ is continuous at $x_0$. For
every $x \in U \setminus \{x_0\}$, since $f(x) \ne l$, we have

$$g(f(x)) = \tilde g(f(x)) = \tilde g(\tilde f(x)),$$

and hence

$$\lim_{x \to x_0} g(f(x)) = \lim_{x \to x_0} \tilde g(\tilde f(x)) = \tilde g(\tilde f(x_0)) = L,$$

thereby proving the result. ∎


3.4 On the Limit of Restrictions

We have studied some properties of limits in a context where $E$ and $F$ are metric
spaces, $x_0$ is a cluster point of $E$, and either $f : E \to F$ or $f : E \setminus \{x_0\} \to F$. We
now note that all the aforementioned considerations still hold if we assume that the
domain of $f$ is a subset of $E$, say, $D \subseteq E$, provided that $x_0$ is a “cluster point” of
$D$. By this we mean that every neighborhood of $x_0$ contains some point of $D$ that
differs from $x_0$ itself. Notice that $x_0$ might not be an element of $D$.

Now let $f : E \setminus \{x_0\} \to F$, and let $\hat E \subseteq E$. We can consider the restriction of $f$
to $\hat E \setminus \{x_0\}$, i.e., the function $\hat f : \hat E \setminus \{x_0\} \to F$ such that $\hat f(x) = f(x)$ for every
$x \in \hat E \setminus \{x_0\}$.

Theorem 3.15 If the limit of $f$ at $x_0$ exists and $x_0$ is a cluster point of $\hat E$, then the
limit of $\hat f$ at $x_0$ also exists, and it has the same value:

$$\lim_{x \to x_0} \hat f(x) = \lim_{x \to x_0} f(x).$$

Proof The proof follows directly from the definitions. ∎


The previous theorem is often used to establish the nonexistence of the limit of
$f$ at some point $x_0$, trying to find two restrictions along which the two limits differ.

Example 1 The function $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}$, defined by

$$f(x, y) = \frac{xy}{x^2 + y^2},$$

has no limit as $(x, y) \to (0, 0)$, since the restrictions of $f$ to the lines
$\hat E_1 = \{(x, y) : x = 0\}$ and $\hat E_2 = \{(x, y) : x = y\}$ have different limits.

Example 2 More surprising is the case of the function

$$f(x, y) = \frac{x^2 y}{x^4 + y^2}.$$

It can be seen that all its restrictions to the lines passing through $(0, 0)$ have limits
equal to 0. Indeed, this is easily seen for $\hat E_1 = \{(x, y) : x = 0\}$ and $\hat E_2 = \{(x, y) : y = 0\}$,
whereas for any $m \ne 0$,

$$\lim_{x \to 0} f(x, mx) = \lim_{x \to 0} \frac{m x^3}{x^4 + m^2 x^2} = \lim_{x \to 0} \frac{m x}{x^2 + m^2} = 0.$$

However, the restriction to the parabola $\{(x, y) : y = x^2\}$ is constantly equal to $\frac{1}{2}$,
thereby leading to a different limit. Hence, the function $f$ has no limit at $(0, 0)$.
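The behavior described in Example 2 can be observed numerically. The sketch below simply evaluates the function along a line and along the parabola; the slope $m = 2$ and the sample points are hypothetical choices.

```python
import math


def f(x, y):
    """The function of Example 2, defined for (x, y) != (0, 0)."""
    return x * x * y / (x ** 4 + y * y)


ts = (1e-1, 1e-3, 1e-6)

# Restriction to the line y = m*x, with m = 2: the values shrink toward 0.
along_line = [f(t, 2 * t) for t in ts]

# Restriction to the parabola y = x^2: the values stay at 1/2.
along_parabola = [f(t, t * t) for t in ts]
```

The entries of `along_line` decrease toward 0 as the point approaches the origin, whereas those of `along_parabola` all remain (up to rounding) equal to $\frac{1}{2}$.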


Example 3 Quite unlike the two preceding examples, let us now prove that

$$\lim_{(x, y) \to (0, 0)} \frac{x^2 y^2}{x^2 + y^2} = 0.$$

Let $\varepsilon > 0$ be fixed. After having verified that

$$\frac{x^2 y^2}{x^2 + y^2} \le \frac{1}{2}(x^2 + y^2),$$

it is natural to take $\delta = \sqrt{2\varepsilon}$, so that

$$d((x, y), (0, 0)) < \delta \;\Rightarrow\; \Big| \frac{x^2 y^2}{x^2 + y^2} - 0 \Big| < \varepsilon.$$
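One possible way of verifying the inequality used in Example 3, and of seeing why $\delta = \sqrt{2\varepsilon}$ works, is sketched below. This is only one of several elementary arguments and is not necessarily the one the author had in mind.

```latex
\begin{align*}
0 \le (x^2 - y^2)^2
  &\;\Longrightarrow\; 2x^2y^2 \le x^4 + y^4 \le (x^2 + y^2)^2 \\
  &\;\Longrightarrow\; \frac{x^2y^2}{x^2 + y^2} \le \frac{1}{2}\,(x^2 + y^2)
     \qquad \text{for } (x,y) \neq (0,0), \\[4pt]
d\big((x,y),(0,0)\big) < \delta = \sqrt{2\varepsilon}
  &\;\Longrightarrow\; x^2 + y^2 < 2\varepsilon
  \;\Longrightarrow\; \frac{x^2y^2}{x^2 + y^2} \le \frac{1}{2}\,(x^2 + y^2) < \varepsilon .
\end{align*}
```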


Now let $E$ be a subset of $\mathbb{R}$ and $F$ be a general metric space. Given $f : E \to F$
or $f : E \setminus \{x_0\} \to F$, we can consider the two restrictions $\hat f_1$ and $\hat f_2$ to the sets
$\hat E_1 = E \,\cap\, ]-\infty, x_0[$ and $\hat E_2 = E \,\cap\, ]x_0, +\infty[$, respectively. If $x_0$ is a cluster point
of $\hat E_1$, then we call the “left limit” of $f$ at $x_0$, whenever it exists, the limit of $\hat f_1(x)$
when $x$ approaches $x_0$ (in $\hat E_1$), and we denote it by

$$\lim_{x \to x_0^-} f(x).$$

Analogously, if $x_0$ is a cluster point of $\hat E_2$, we call the “right limit” of $f$ at $x_0$,
whenever it exists, the limit of $\hat f_2(x)$ when $x$ approaches $x_0$ (in $\hat E_2$), and we denote
it by

$$\lim_{x \to x_0^+} f(x).$$

Theorem 3.16 If $x_0$ is a cluster point of both $E \,\cap\, ]-\infty, x_0[$ and $E \,\cap\, ]x_0, +\infty[$,
then the limit of $f$ at $x_0$ exists if and only if both the left limit and the right limit
exist, and they are equal to each other.


Proof We already know that if the limit of $f$ at $x_0$ exists, then all restrictions of $f$
must have the same limit at $x_0$. Conversely, let us assume that the left limit and the
right limit exist and that $l \in F$ is their common value. Let $\varepsilon > 0$ be fixed. Then
there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that if $x \in E$,

$$x_0 - \delta_1 < x < x_0 \;\Rightarrow\; d(f(x), l) < \varepsilon$$

and

$$x_0 < x < x_0 + \delta_2 \;\Rightarrow\; d(f(x), l) < \varepsilon.$$

Defining $\delta = \min\{\delta_1, \delta_2\}$, we then have that if $x \in E \setminus \{x_0\}$,

$$x_0 - \delta < x < x_0 + \delta \;\Rightarrow\; d(f(x), l) < \varepsilon,$$

showing that the limit of $f$ at $x_0$ exists and is equal to $l$. ∎

Example The “sign function” $f : \mathbb{R} \to \mathbb{R}$, defined as

$$f(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0, \end{cases}$$

has no limit at $x_0 = 0$ since $\lim_{x \to 0^-} f(x) = -1$ and $\lim_{x \to 0^+} f(x) = 1$.


3.5 The Extended Real Line 


Let us consider the function $\varphi : \mathbb{R} \to \,]-1, 1[$ defined as

$$\varphi(x) = \frac{x}{1 + |x|}.$$

This is an invertible function, with inverse $\varphi^{-1} : \,]-1, 1[\, \to \mathbb{R}$ given by

$$\varphi^{-1}(y) = \frac{y}{1 - |y|}.$$

We can now define a new distance on $\mathbb{R}$ as

$$\tilde d(x, x') = |\varphi(x) - \varphi(x')|.$$

It can indeed be verified that it satisfies the four properties characterizing a distance.
Let us denote by $\tilde B(x_0, \rho)$ the open ball for this new distance centered at $x_0$, with
radius $\rho > 0$, i.e.,

$$\tilde B(x_0, \rho) = \{x \in \mathbb{R} : |\varphi(x) - \varphi(x_0)| < \rho\}.$$

We claim that the neighborhoods of any point $x_0 \in \mathbb{R}$ remain the same as those
provided by the usual distance on $\mathbb{R}$. Indeed, since $\varphi$ is continuous at $x_0$, for every
$\rho_1 > 0$, there exists a $\rho_2 > 0$ such that

$$|x - x_0| < \rho_2 \;\Rightarrow\; |\varphi(x) - \varphi(x_0)| < \rho_1,$$

i.e.,

$$]x_0 - \rho_2, x_0 + \rho_2[\, \subseteq \tilde B(x_0, \rho_1).$$

Conversely, since $\varphi^{-1}$ is continuous at $y_0 = \varphi(x_0) \in \,]-1, 1[$, for every $\rho_1 > 0$
there exists a $\rho_2 > 0$ such that

$$|y - y_0| < \rho_2 \;\Rightarrow\; y \in \,]-1, 1[ \ \text{ and } \ |\varphi^{-1}(y) - \varphi^{-1}(y_0)| < \rho_1.$$

In particular, taking $y = \varphi(x)$,

$$|\varphi(x) - \varphi(x_0)| < \rho_2 \;\Rightarrow\; \varphi(x) \in \,]-1, 1[ \ \text{ and } \ |x - x_0| < \rho_1,$$

i.e.,

$$\tilde B(x_0, \rho_2) \subseteq \,]x_0 - \rho_1, x_0 + \rho_1[.$$

We have thus proved our claim.
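Let us now introduce a new set, $\overline{\mathbb{R}}$, defined by adding to $\mathbb{R}$ two new elements
denoted by $-\infty$ and $+\infty$, i.e.,

$$\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}.$$

The set $\overline{\mathbb{R}}$ is totally ordered, maintaining the usual order on the reals while setting

$$-\infty < x < +\infty \quad \text{for every } x \in \mathbb{R}.$$

Let us define the function $\tilde\varphi : \overline{\mathbb{R}} \to [-1, 1]$ as

$$\tilde\varphi(x) = \begin{cases} -1 & \text{if } x = -\infty, \\ \varphi(x) & \text{if } x \in \mathbb{R}, \\ 1 & \text{if } x = +\infty. \end{cases}$$

It is invertible, with inverse $\tilde\varphi^{-1} : [-1, 1] \to \overline{\mathbb{R}}$ given by

$$\tilde\varphi^{-1}(y) = \begin{cases} -\infty & \text{if } y = -1, \\ \varphi^{-1}(y) & \text{if } y \in \,]-1, 1[, \\ +\infty & \text{if } y = 1. \end{cases}$$

We now define, for every $x, x' \in \overline{\mathbb{R}}$,

$$\tilde d(x, x') = |\tilde\varphi(x) - \tilde\varphi(x')|.$$

It is readily verified that $\tilde d$ is a distance on $\overline{\mathbb{R}}$, so that $\overline{\mathbb{R}}$ is now a metric space. Let
us see, for example, what a ball centered at $+\infty$ looks like:

$$\tilde B(+\infty, \rho) = \{x \in \overline{\mathbb{R}} : |\tilde\varphi(x) - 1| < \rho\} = \{x \in \overline{\mathbb{R}} : \tilde\varphi(x) > 1 - \rho\},$$

hence

$$\tilde B(+\infty, \rho) = \begin{cases} \overline{\mathbb{R}} & \text{if } \rho > 2, \\ ]-\infty, +\infty] & \text{if } \rho = 2, \\ ]\varphi^{-1}(1 - \rho), +\infty] & \text{if } \rho < 2, \end{cases}$$

where we have used the notation

$$]a, +\infty] = \{x \in \overline{\mathbb{R}} : x > a\} = \,]a, +\infty[\, \cup \{+\infty\}.$$

We can thus state that a neighborhood of $+\infty$ is a set that contains, besides $+\infty$
itself, an interval of the type $]\alpha, +\infty[$ for some $\alpha \in \mathbb{R}$.

Analogously, a neighborhood of $-\infty$ is a set containing $-\infty$ and an interval of
the type $]-\infty, \beta[$ for some $\beta \in \mathbb{R}$.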
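The distance on the extended real line can be experimented with in floating-point arithmetic. The following is a minimal sketch, assuming that `math.inf` and `-math.inf` stand in for $+\infty$ and $-\infty$; the function names are hypothetical, and rounding makes very large finite numbers indistinguishable from $+\infty$ for this distance.

```python
import math


def phi(x):
    """Extended map onto [-1, 1]: x/(1+|x|) on the reals, -1 and 1 at the infinities."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return -1.0
    return x / (1.0 + abs(x))


def phi_inv(y):
    """Inverse map from [-1, 1] back to the extended real line."""
    if y == 1.0:
        return math.inf
    if y == -1.0:
        return -math.inf
    return y / (1.0 - abs(y))


def dist(x, y):
    """Distance on the extended real line induced by phi."""
    return abs(phi(x) - phi(y))


def in_ball_at_plus_inf(x, rho):
    """Membership in the open ball of radius rho centered at +infinity."""
    return dist(x, math.inf) < rho


rho = 0.5
threshold = phi_inv(1 - rho)  # the ball is ]threshold, +inf], as in the formula above
```

With `rho = 0.5` the threshold equals 1, so that 2 belongs to the ball whereas 0.5 does not.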

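Let us see how the definition of limit translates in some cases where the new
elements $-\infty$ and $+\infty$ appear.

To start with, let $f : E \to F$ be a function with $E \subseteq \mathbb{R}$, whose codomain $F$ is
any metric space. Considering $E$ a subset of $\overline{\mathbb{R}}$, we have that $+\infty$ is a cluster point
of $E$ if and only if $E$ is not bounded from above. In that case,

$$\lim_{x \to +\infty} f(x) = l \in F \iff \forall V \text{ neighborhood of } l \quad \exists U \text{ neighborhood of } +\infty : \ f(U \cap E) \subseteq V$$
$$\iff \forall \varepsilon > 0 \quad \exists \alpha \in \mathbb{R} : \quad x > \alpha \;\Rightarrow\; d(f(x), l) < \varepsilon.$$

Similarly, if $E$ is not bounded from below,

$$\lim_{x \to -\infty} f(x) = l \in F \iff \forall \varepsilon > 0 \quad \exists \beta \in \mathbb{R} : \quad x < \beta \;\Rightarrow\; d(f(x), l) < \varepsilon.$$

Notice that

$$\lim_{x \to -\infty} f(x) = l \iff \lim_{x \to +\infty} f(-x) = l.$$

Let us now consider a function $f : E \to \mathbb{R}$, or $f : E \setminus \{x_0\} \to \mathbb{R}$, where $E$ is
any metric space and $x_0$ is a cluster point of $E$. If we consider the codomain $F = \mathbb{R}$
a subset of $\overline{\mathbb{R}}$, then

$$\lim_{x \to x_0} f(x) = +\infty \iff \forall V \text{ neighborhood of } +\infty \quad \exists U \text{ neighborhood of } x_0 : \ f(U \setminus \{x_0\}) \subseteq V$$
$$\iff \forall \alpha \in \mathbb{R} \quad \exists \delta > 0 : \quad 0 < d(x, x_0) < \delta \;\Rightarrow\; f(x) > \alpha.$$

Similarly,

$$\lim_{x \to x_0} f(x) = -\infty \iff \forall \beta \in \mathbb{R} \quad \exists \delta > 0 : \quad 0 < d(x, x_0) < \delta \;\Rightarrow\; f(x) < \beta.$$

Notice that

$$\lim_{x \to x_0} f(x) = -\infty \iff \lim_{x \to x_0} [-f(x)] = +\infty.$$

The foregoing situations can be combined together. For example, if $E \subseteq \mathbb{R}$ is not
bounded from above and $F = \mathbb{R}$ is considered a subset of $\overline{\mathbb{R}}$, then

$$\lim_{x \to +\infty} f(x) = +\infty \iff \forall V \text{ neighborhood of } +\infty \quad \exists U \text{ neighborhood of } +\infty : \ f(U \cap E) \subseteq V$$
$$\iff \forall \alpha \in \mathbb{R} \quad \exists \alpha' \in \mathbb{R} : \quad x > \alpha' \;\Rightarrow\; f(x) > \alpha,$$

and

$$\lim_{x \to +\infty} f(x) = -\infty \iff \forall \beta \in \mathbb{R} \quad \exists \alpha' \in \mathbb{R} : \quad x > \alpha' \;\Rightarrow\; f(x) < \beta.$$

On the other hand, if $E \subseteq \mathbb{R}$ is not bounded from below,

$$\lim_{x \to -\infty} f(x) = +\infty \iff \forall \alpha \in \mathbb{R} \quad \exists \beta' \in \mathbb{R} : \quad x < \beta' \;\Rightarrow\; f(x) > \alpha,$$

and

$$\lim_{x \to -\infty} f(x) = -\infty \iff \forall \beta \in \mathbb{R} \quad \exists \beta' \in \mathbb{R} : \quad x < \beta' \;\Rightarrow\; f(x) < \beta.$$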


An important particular situation is encountered when dealing with a sequence
$(a_n)_n$ in a metric space $F$. We are thus given a function $f : \mathbb{N} \to F$ defined as
$f(n) = a_n$. Considering $\mathbb{N}$ a subset of $\overline{\mathbb{R}}$, it is readily seen that the only cluster
point of $\mathbb{N}$ is $+\infty$, and, adapting the definition of limit to this case, we can write

$$\lim_{n \to +\infty} a_n = l \in F \iff \forall \varepsilon > 0 \quad \exists \bar n \in \mathbb{N} : \quad n \ge \bar n \;\Rightarrow\; d(a_n, l) < \varepsilon.$$

As a particular case we may have $F = \mathbb{R}$, considered as a subset of $\overline{\mathbb{R}}$, and we thus
recover the preceding definitions when $l = -\infty$ or $l = +\infty$.

The limit of a sequence will often be denoted simply by $\lim_n a_n$, tacitly implying
that $n \to +\infty$.
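The definition of the limit of a sequence can be made concrete on a simple case. The sketch below uses the sequence $a_n = \frac{1}{n+1}$, with limit $l = 0$ in $\mathbb{R}$, as a hypothetical example; the function `n_bar` returns one admissible index $\bar n$ for a given $\varepsilon$, not necessarily the smallest one.

```python
import math


def a(n):
    """The sample sequence a_n = 1/(n+1), whose limit is 0."""
    return 1.0 / (n + 1)


def n_bar(eps):
    """An index such that n >= n_bar(eps) implies |a(n) - 0| < eps.

    If n >= ceil(1/eps), then a(n) = 1/(n+1) < 1/n <= eps.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.ceil(1.0 / eps)


# Check the implication on a finite range of indices for a few tolerances.
checks = {
    eps: all(abs(a(n) - 0.0) < eps for n in range(n_bar(eps), n_bar(eps) + 1000))
    for eps in (0.5, 0.01, 1e-4)
}
```

Every value stored in `checks` is `True`; of course a finite check only illustrates the definition and does not replace the proof.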
3.6 Some Operations with $-\infty$ and $+\infty$

When the limits are $-\infty$ or $+\infty$, the normal operations with limits cannot be used.
We will provide here a few useful rules for some of these cases. In what follows, all
the functions will be defined either on the whole metric space $E$ or on $E \setminus \{x_0\}$, and
$x_0$ will always be assumed to be a cluster point of $E$. Let us start with the sum of
two functions.

Theorem 3.17 If

$$\lim_{x \to x_0} f(x) = +\infty$$

and there exists a $\gamma \in \mathbb{R}$ such that

$$g(x) \ge \gamma, \quad \text{for every } x \ne x_0 \text{ in a neighborhood of } x_0,$$

then

$$\lim_{x \to x_0} [f(x) + g(x)] = +\infty.$$

Proof Let $\alpha \in \mathbb{R}$ be fixed. Defining $\tilde\alpha = \alpha - \gamma$, there exists a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; f(x) > \tilde\alpha.$$

Hence,

$$0 < d(x, x_0) < \delta \;\Rightarrow\; f(x) + g(x) > \tilde\alpha + \gamma = \alpha,$$

thereby proving the result. ∎


Corollary 3.18 If

$$\lim_{x \to x_0} f(x) = +\infty \quad \text{and} \quad \lim_{x \to x_0} g(x) = l \in \mathbb{R} \ \ (\text{or } l = +\infty),$$

then

$$\lim_{x \to x_0} [f(x) + g(x)] = +\infty.$$

Proof If the limit of $g$ is some $l \in \mathbb{R}$, then there exists a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; g(x) \ge l - 1.$$

On the other hand, if the limit of $g$ is $+\infty$, then we can find a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; g(x) \ge 0.$$

In any case, the previous theorem can be applied to obtain the conclusion. ∎

As a mnemonic rule, we will briefly write

$$(+\infty) + l = +\infty \quad \text{if } l \text{ is a real number};$$

$$(+\infty) + (+\infty) = +\infty.$$

In perfect analogy, we can state a theorem, with a related corollary, in the case
where the limit of $f$ is $-\infty$. As a mnemonic rule, we will then write

$$(-\infty) + l = -\infty \quad \text{if } l \text{ is a real number};$$

$$(-\infty) + (-\infty) = -\infty.$$


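Regarding the product of two functions, we have the following theorem.

Theorem 3.19 If

$$\lim_{x \to x_0} f(x) = +\infty$$

and there exists a $\gamma > 0$ such that

$$g(x) \ge \gamma, \quad \text{for every } x \ne x_0 \text{ in a neighborhood of } x_0,$$

then

$$\lim_{x \to x_0} [f(x) g(x)] = +\infty.$$

Proof Let $\alpha \in \mathbb{R}$ be fixed. We may assume with no loss of generality that $\alpha > 0$.
Setting $\tilde\alpha = \frac{\alpha}{\gamma}$, there exists a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; f(x) > \tilde\alpha.$$

Hence,

$$0 < d(x, x_0) < \delta \;\Rightarrow\; f(x) g(x) > \tilde\alpha \gamma = \alpha,$$

thereby proving the statement. ∎

Corollary 3.20 If

$$\lim_{x \to x_0} f(x) = +\infty \quad \text{and} \quad \lim_{x \to x_0} g(x) = l > 0 \ \ (\text{or } l = +\infty),$$

then

$$\lim_{x \to x_0} [f(x) g(x)] = +\infty.$$

Proof If the limit of $g$ is a real number $l > 0$, then there exists a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; g(x) \ge \frac{l}{2}.$$

On the other hand, if the limit of $g$ is $+\infty$, then there is a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; g(x) \ge 1.$$

In any case, the previous theorem provides the conclusion. ∎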


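In the same spirit, we will briefly write

$$(+\infty) \cdot l = +\infty \quad \text{if } l > 0 \text{ is a real number};$$

$$(+\infty) \cdot (+\infty) = +\infty,$$

with all the following variants:

$$(+\infty) \cdot l = -\infty \quad \text{if } l < 0 \text{ is a real number};$$

$$(-\infty) \cdot l = -\infty \quad \text{if } l > 0 \text{ is a real number};$$

$$(-\infty) \cdot l = +\infty \quad \text{if } l < 0 \text{ is a real number};$$

$$(+\infty) \cdot (-\infty) = -\infty;$$

$$(-\infty) \cdot (-\infty) = +\infty.$$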
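Let us now analyze the reciprocal of a function. We have two theorems.

Theorem 3.21 If

$$\lim_{x \to x_0} |f(x)| = +\infty,$$

then

$$\lim_{x \to x_0} \frac{1}{f(x)} = 0.$$

Proof Let $\varepsilon > 0$ be fixed. Setting $\alpha = \frac{1}{\varepsilon}$, there exists a $\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; |f(x)| > \alpha.$$

Hence,

$$0 < d(x, x_0) < \delta \;\Rightarrow\; \Big| \frac{1}{f(x)} - 0 \Big| = \frac{1}{|f(x)|} < \frac{1}{\alpha} = \varepsilon,$$

thereby proving the claim. ∎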
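Theorem 3.22 If

$$\lim_{x \to x_0} f(x) = 0$$

and

$$f(x) > 0 \quad \text{for every } x \ne x_0 \text{ in a neighborhood of } x_0,$$

then

$$\lim_{x \to x_0} \frac{1}{f(x)} = +\infty.$$

However, if

$$f(x) < 0 \quad \text{for every } x \ne x_0 \text{ in a neighborhood of } x_0,$$

then

$$\lim_{x \to x_0} \frac{1}{f(x)} = -\infty.$$

Proof We treat the first case, the second one being similar. Let $\alpha \in \mathbb{R}$ be fixed;
we can assume without loss of generality that $\alpha > 0$. Setting $\varepsilon = \frac{1}{\alpha}$, there exists a
$\delta > 0$ such that

$$0 < d(x, x_0) < \delta \;\Rightarrow\; 0 < f(x) < \varepsilon.$$

Then

$$0 < d(x, x_0) < \delta \;\Rightarrow\; \frac{1}{f(x)} > \frac{1}{\varepsilon} = \alpha,$$

and the proof is completed. ∎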


Finally, we present two useful variants of the Squeeze Theorem 3.10 in the case
where the limit is infinite, where only one comparison function will be needed.

Theorem 3.23 Let $F_1$ be such that
\[ \lim_{x \to x_0} F_1(x) = +\infty. \]
If
\[ f(x) \ge F_1(x) \quad \text{for every } x \ne x_0 \text{ in a neighborhood of } x_0, \]
then
\[ \lim_{x \to x_0} f(x) = +\infty. \]
Proof Setting $g(x) = f(x) - F_1(x)$, we have $g(x) \ge 0$ for every $x \ne x_0$ in a
neighborhood of $x_0$ and $f(x) = F_1(x) + g(x)$. The result then follows directly
from Theorem 3.17. □

In the case where the limit is $-\infty$, we have the following analogous result.

Theorem 3.24 Let $F_2$ be such that
\[ \lim_{x \to x_0} F_2(x) = -\infty. \]
If
\[ f(x) \le F_2(x) \quad \text{for every } x \ne x_0 \text{ in a neighbourhood of } x_0, \]
then
\[ \lim_{x \to x_0} f(x) = -\infty. \]


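We will now deal with some elementary situations when $x$ approaches either $+\infty$
or $-\infty$.

1. Let us first consider the function
\[ f(x) = x^n, \]
where $n$ is an integer. It can be verified by induction that for every $n \ge 1$,
\[ x \ge 1 \ \Rightarrow\ x^n \ge x. \]
Since clearly $\lim_{x \to +\infty} x = +\infty$, as a consequence of the preceding theorems we
have
\[ \lim_{x \to +\infty} x^n = \begin{cases} +\infty & \text{if } n \ge 1, \\ 1 & \text{if } n = 0, \\ 0 & \text{if } n \le -1. \end{cases} \]
If we then take into account that
\[ (-x)^n = x^n \ \text{ if } n \text{ is even}, \qquad (-x)^n = -x^n \ \text{ if } n \text{ is odd}, \]
we also conclude that
\[ \lim_{x \to -\infty} x^n = \begin{cases} +\infty & \text{if } n \ge 1 \text{ is even}, \\ -\infty & \text{if } n \ge 1 \text{ is odd}, \\ 1 & \text{if } n = 0, \\ 0 & \text{if } n \le -1. \end{cases} \]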


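2. Let us consider the polynomial function
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_2 x^2 + a_1 x + a_0, \]
where $n \ge 1$ and $a_n \ne 0$. Writing
\[ f(x) = x^n \left( a_n + \frac{a_{n-1}}{x} + \dots + \frac{a_2}{x^{n-2}} + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n} \right) \]
and using the fact that
\[ \lim_{x \to \pm\infty} \left( a_n + \frac{a_{n-1}}{x} + \dots + \frac{a_2}{x^{n-2}} + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n} \right) = a_n, \]
we see that
\[ \lim_{x \to +\infty} f(x) = \begin{cases} +\infty & \text{if } a_n > 0, \\ -\infty & \text{if } a_n < 0, \end{cases} \]
whereas
\[ \lim_{x \to -\infty} f(x) = \begin{cases} +\infty & \text{if either } [n \text{ is even and } a_n > 0] \text{ or } [n \text{ is odd and } a_n < 0], \\ -\infty & \text{if either } [n \text{ is even and } a_n < 0] \text{ or } [n \text{ is odd and } a_n > 0]. \end{cases} \]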


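3. Consider now the rational function
\[ f(x) = \frac{a_n x^n + a_{n-1} x^{n-1} + \dots + a_2 x^2 + a_1 x + a_0}{b_m x^m + b_{m-1} x^{m-1} + \dots + b_2 x^2 + b_1 x + b_0}, \]
where $n, m \ge 1$ and $a_n \ne 0$, $b_m \ne 0$. As previously, writing
\[ f(x) = x^{n-m} \, \frac{a_n + \dfrac{a_{n-1}}{x} + \dots + \dfrac{a_1}{x^{n-1}} + \dfrac{a_0}{x^n}}{b_m + \dfrac{b_{m-1}}{x} + \dots + \dfrac{b_1}{x^{m-1}} + \dfrac{b_0}{x^m}}, \]
we can conclude that
\[ \lim_{x \to +\infty} f(x) = \lim_{x \to +\infty} \frac{a_n}{b_m} x^{n-m} = \begin{cases} +\infty & \text{if } n > m \text{ and } a_n, b_m \text{ have the same sign}, \\ -\infty & \text{if } n > m \text{ and } a_n, b_m \text{ have opposite signs}, \\ \dfrac{a_n}{b_m} & \text{if } n = m, \\ 0 & \text{if } n < m. \end{cases} \]
In a similar way, once it is observed that
\[ \lim_{x \to -\infty} f(x) = \lim_{x \to -\infty} \frac{a_n}{b_m} x^{n-m}, \]
the limit can be computed in all the different cases.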
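The case analysis for the limit at $+\infty$ of a rational function depends only on the two degrees and the two leading coefficients, so it can be sketched in a few lines of Python. This is only an illustration: the function name `rational_limit_at_plus_infinity` and the convention of listing coefficients from degree 0 upward are our own assumptions, not notation from the text.

```python
import math

def rational_limit_at_plus_infinity(a, b):
    """Limit as x -> +infinity of (a[0] + a[1] x + ... + a[n] x^n) / (b[0] + ... + b[m] x^m).

    Hypothetical helper: coefficients are listed from degree 0 upward,
    and the leading coefficients a[n], b[m] must be nonzero.
    """
    n, m = len(a) - 1, len(b) - 1
    a_n, b_m = a[-1], b[-1]
    if a_n == 0 or b_m == 0:
        raise ValueError("leading coefficients must be nonzero")
    if n > m:
        # The sign of a_n / b_m decides between +infinity and -infinity.
        return math.inf if a_n * b_m > 0 else -math.inf
    if n == m:
        return a_n / b_m
    return 0.0

# (3x^2 + 1) / (x + 2): here n > m and the leading coefficients have the same sign.
print(rational_limit_at_plus_infinity([1, 0, 3], [2, 1]))
# (2x + 5) / (4x + 1): here n = m, so the limit is a_n / b_m.
print(rational_limit_at_plus_infinity([5, 2], [1, 4]))
```

The limit at $-\infty$ could be handled in the same way by also taking into account the parity of $n - m$.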




3.7 Limits of Monotone Functions 


We will now see how the monotonicity of a function makes it possible to establish 
the existence of left or right limits. 
Let $E$ be a subset of $\mathbb{R}$, and let $x_0$ be a cluster point of $E \cap\, ]x_0, +\infty[$.

Theorem 3.25 If $f : E \cap\, ]x_0, +\infty[\, \to \mathbb{R}$ is increasing, then
\[ \lim_{x \to x_0^+} f(x) = \inf f(E \cap\, ]x_0, +\infty[). \]
On the other hand, if $f$ is decreasing, then
\[ \lim_{x \to x_0^+} f(x) = \sup f(E \cap\, ]x_0, +\infty[). \]


Proof We prove only the first statement, since the proof of the second one is
analogous. Set $l = \inf f(E \cap\, ]x_0, +\infty[)$. If it happens that $l \in \mathbb{R}$, then we fix
an $\varepsilon > 0$. By the properties of the infimum, there exists a $\bar y \in f(E \cap\, ]x_0, +\infty[)$
such that $\bar y < l + \varepsilon$. Then, taking $\bar x \in E \cap\, ]x_0, +\infty[$ satisfying $f(\bar x) = \bar y$ and using
the fact that $f$ is increasing, we have
\[ x_0 < x < \bar x \ \Rightarrow\ l \le f(x) \le f(\bar x) < l + \varepsilon, \]
thereby completing the proof in this case.
If it happens that $l = -\infty$, then we fix a $\beta \in \mathbb{R}$. Then, since $f(E \cap\, ]x_0, +\infty[)$
is unbounded from below, there exists an $\bar x \in E \cap\, ]x_0, +\infty[$ satisfying $f(\bar x) < \beta$.
Using the fact that $f$ is increasing, we have
\[ x_0 < x < \bar x \ \Rightarrow\ f(x) \le f(\bar x) < \beta, \]
thereby proving that $\lim_{x \to x_0^+} f(x) = -\infty$. □

Notice that the previous statement also includes the case where $x_0 = -\infty$,
provided that $E$ is unbounded from below.
We now state the analogous result by assuming that $x_0$ is a cluster point of
$E \cap\, ]-\infty, x_0[$.


Theorem 3.26 If $f : E \cap\, ]-\infty, x_0[\, \to \mathbb{R}$ is increasing, then
\[ \lim_{x \to x_0^-} f(x) = \sup f(E \cap\, ]-\infty, x_0[). \]
On the other hand, if $f$ is decreasing, then
\[ \lim_{x \to x_0^-} f(x) = \inf f(E \cap\, ]-\infty, x_0[). \]
Proof Defining $g(x) = f(-x)$ on the reflected set $\{-x : x \in E\}$, we are led back to the
previous theorem at the point $-x_0$, and the conclusion rapidly follows. □


The previous statement also includes the case where $x_0 = +\infty$, provided that
$E$ is unbounded from above. This happens, e.g., for real sequences, leading to the
following corollary.

Corollary 3.27 Every monotone sequence of real numbers has a limit.
Proof If $(a_n)_n$ is increasing, then $\lim_n a_n = \sup\{a_n : n \in \mathbb{N}\}$, and this limit can be
either a real number or $+\infty$. Similarly, if $(a_n)_n$ is decreasing, then the limit will be
either a real number or $-\infty$. □


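As an example, consider the sequence $(a_n)_n$ defined for $n \ge 1$ by the formula
\[ a_n = \left( 1 + \frac{1}{n} \right)^n. \]
Let us prove that it is increasing:
\[ \frac{a_{n+1}}{a_n} = \frac{\left( 1 + \frac{1}{n+1} \right)^{n+1}}{\left( 1 + \frac{1}{n} \right)^n} = \left( \frac{n+2}{n+1} \right)^{n+1} \left( \frac{n}{n+1} \right)^{n+1} \frac{n+1}{n} = \left( \frac{n^2 + 2n}{(n+1)^2} \right)^{n+1} \frac{n+1}{n} = \left( 1 - \frac{1}{(n+1)^2} \right)^{n+1} \frac{n+1}{n}, \]
so that, by the Bernoulli inequality,
\[ \frac{a_{n+1}}{a_n} \ge \left( 1 + (n+1) \frac{-1}{(n+1)^2} \right) \frac{n+1}{n} = 1. \]
We have thus shown that $a_n \le a_{n+1}$ for every $n \ge 1$; hence, $(a_n)_n$ is increasing.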


Let us now consider the sequence $(b_n)_n$ defined for $n \ge 1$ by
\[ b_n = \left( 1 + \frac{1}{n} \right)^{n+1}. \]
Let us prove that $(b_n)_n$ is decreasing:
\[ \frac{b_n}{b_{n+1}} = \frac{\left( 1 + \frac{1}{n} \right)^{n+1}}{\left( 1 + \frac{1}{n+1} \right)^{n+2}} = \frac{n}{n+1} \left( \frac{n+1}{n} \right)^{n+2} \left( \frac{n+1}{n+2} \right)^{n+2} = \frac{n}{n+1} \left( \frac{(n+1)^2}{n^2 + 2n} \right)^{n+2} \]
\[ = \frac{n}{n+1} \left( 1 + \frac{1}{n^2 + 2n} \right)^{n+2} \ge \frac{n}{n+1} \left( 1 + (n+2) \frac{1}{n^2 + 2n} \right) = 1, \]
showing that $b_n \ge b_{n+1}$ for every $n \ge 1$ (once again we have used the Bernoulli
inequality). Since $0 < a_n < b_n$ for every $n \ge 1$, both the sequences $(a_n)_n$ and $(b_n)_n$
have a finite positive limit. Additionally, since
\[ \frac{\lim_n b_n}{\lim_n a_n} = \lim_n \frac{b_n}{a_n} = \lim_n \left( 1 + \frac{1}{n} \right) = 1, \]
we can conclude that $\lim_n a_n = \lim_n b_n$. This is a real number, and it is called either
“Euler’s number” or “Napier’s constant”; it is denoted by the letter $e$. We can thus
write
\[ e = \lim_n \left( 1 + \frac{1}{n} \right)^n. \]
It can be proved that it is an irrational number:
\[ e = 2.71828\ldots \]
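As a numerical illustration (not a proof), the two sequences $a_n = (1 + 1/n)^n$ and $b_n = (1 + 1/n)^{n+1}$ can be tabulated to see how they enclose $e$ from below and from above. The following Python sketch uses floating-point arithmetic, so the printed digits are only approximations; the function names `a` and `b` simply mirror the sequences in the text.

```python
import math

def a(n):
    """a_n = (1 + 1/n)^n, the increasing sequence."""
    return (1 + 1 / n) ** n

def b(n):
    """b_n = (1 + 1/n)^(n+1), the decreasing sequence."""
    return (1 + 1 / n) ** (n + 1)

# Tabulate a_n <= e <= b_n for a few values of n.
for n in (1, 10, 100, 1000, 10000):
    print(n, a(n), math.e, b(n))
```

The gap $b_n - a_n = a_n / n$ shrinks only like $1/n$, which suggests that this sequence is a rather slow way to approximate $e$.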




3.8 Limits for Exponentials and Logarithms 


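First of all, we want to prove that, when $x$ varies in $\mathbb{R}$,
\[ \lim_{x \to +\infty} \left( 1 + \frac{1}{x} \right)^x = e. \tag{3.2} \]
For every $x > 0$, let $n(x)$ be the (unique) natural number such that
\[ n(x) \le x < n(x) + 1, \]
i.e., the “integer part” of $x$. Then, for $x \ge 1$,
\[ \left( 1 + \frac{1}{n(x)+1} \right)^{n(x)} < \left( 1 + \frac{1}{x} \right)^{n(x)} \le \left( 1 + \frac{1}{x} \right)^x < \left( 1 + \frac{1}{x} \right)^{n(x)+1} \le \left( 1 + \frac{1}{n(x)} \right)^{n(x)+1}. \]
Since $\lim_{x \to +\infty} n(x) = +\infty$, we have
\[ \lim_{x \to +\infty} \left( 1 + \frac{1}{n(x)} \right)^{n(x)+1} = \lim_n \left( 1 + \frac{1}{n} \right)^{n+1} = e \]
and
\[ \lim_{x \to +\infty} \left( 1 + \frac{1}{n(x)+1} \right)^{n(x)} = \lim_n \left( 1 + \frac{1}{n+1} \right)^n = e. \]
By the Squeeze Theorem 3.10, identity (3.2) follows.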


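We now prove that also
\[ \lim_{x \to -\infty} \left( 1 + \frac{1}{x} \right)^x = e. \tag{3.3} \]
Indeed, using the formula $\lim_{x \to -\infty} f(x) = \lim_{x \to +\infty} f(-x)$, we have
\[ \lim_{x \to -\infty} \left( 1 + \frac{1}{x} \right)^x = \lim_{x \to +\infty} \left( 1 - \frac{1}{x} \right)^{-x} = \lim_{x \to +\infty} \left( 1 + \frac{1}{x-1} \right)^x \]
\[ = \lim_{y \to +\infty} \left( 1 + \frac{1}{y} \right)^{y+1} = \lim_{y \to +\infty} \left( 1 + \frac{1}{y} \right)^y \left( 1 + \frac{1}{y} \right) = e \cdot 1 = e. \]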


We are now in a position to establish the following important result involving the 
exponential and the logarithm functions. 


Theorem 3.28 We have
\[ \lim_{x \to 0} \frac{\log_a(1 + x)}{x} = \log_a(e), \qquad \lim_{x \to 0} \frac{a^x - 1}{x} = \frac{1}{\log_a(e)}. \]

Proof By (3.2) and the continuity of $\log_a$ we have
\[ \lim_{x \to 0^+} \frac{\log_a(1 + x)}{x} = \lim_{y \to +\infty} y \log_a\!\left( 1 + \frac{1}{y} \right) = \lim_{y \to +\infty} \log_a\!\left( \left( 1 + \frac{1}{y} \right)^y \right) = \log_a(e), \]
and by (3.3) the same is true for the left limit. Moreover,
\[ \lim_{x \to 0} \frac{a^x - 1}{x} = \lim_{y \to 0} \frac{y}{\log_a(1 + y)} = \frac{1}{\lim_{y \to 0} \dfrac{\log_a(1 + y)}{y}} = \frac{1}{\log_a(e)}, \]
thereby completing the proof. □


Notice that the choice $a = e$ considerably simplifies the preceding formulas: We
have
\[ \lim_{x \to 0} \frac{\log_e(1 + x)}{x} = 1, \qquad \lim_{x \to 0} \frac{e^x - 1}{x} = 1. \]
That is why we will almost always choose as the base of the exponential and the
logarithm the Euler number $e$, which is the “natural base.” We will write $\exp(x)$
(or even $\exp x$) instead of $\exp_e(x)$ and $\ln(x)$ (or even $\ln x$) instead of $\log_e(x)$. The
following formulas may be useful:
\[ a^x = e^{x \ln a}, \qquad \log_a(x) = \frac{\ln x}{\ln a}. \]
Also, for the hyperbolic functions the base $e$ will always be preferred, and we will
write $\cosh(x)$ (or even $\cosh x$) instead of $\cosh_e(x)$ and $\sinh(x)$ (or even $\sinh x$)
instead of $\sinh_e(x)$.
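The following identities hold:
\[ \lim_{x \to 0} \frac{\sinh x}{x} = 1, \qquad \lim_{x \to 0} \frac{\cosh x - 1}{x^2} = \frac{1}{2}, \qquad \lim_{x \to 0} \frac{\tanh x}{x} = 1. \]
Let us prove, for example, the first one:
\[ \lim_{x \to 0} \frac{e^x - e^{-x}}{2x} = \frac{1}{2} \left( \lim_{x \to 0} \frac{e^x - 1}{x} + \lim_{x \to 0} \frac{e^{-x} - 1}{-x} \right) = \frac{1}{2} \left( 1 + \lim_{y \to 0} \frac{e^y - 1}{y} \right) = 1. \]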

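Let us now concentrate on the behavior of the exponential and the logarithm at
$+\infty$. Using the properties of monotonicity and surjectivity of $\exp_a : \mathbb{R} \to\, ]0, +\infty[$,
we see that
\[ \lim_{x \to +\infty} a^x = \begin{cases} +\infty & \text{if } a > 1, \\ 0 & \text{if } 0 < a < 1, \end{cases} \]
while
\[ \lim_{x \to +\infty} \log_a(x) = \begin{cases} +\infty & \text{if } a > 1, \\ -\infty & \text{if } 0 < a < 1. \end{cases} \]
Writing $x^\alpha = \exp(\alpha \ln x)$, we see that
\[ \lim_{x \to +\infty} x^\alpha = \begin{cases} +\infty & \text{if } \alpha > 0, \\ 1 & \text{if } \alpha = 0, \\ 0 & \text{if } \alpha < 0. \end{cases} \]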


In the following theorem, we compare the growth of $e^x$, $x^\alpha$, and $\ln x$ at $+\infty$.

Theorem 3.29 For every $\alpha > 0$ we have
\[ \lim_{x \to +\infty} \frac{e^x}{x^\alpha} = +\infty, \qquad \lim_{x \to +\infty} \frac{\ln x}{x^\alpha} = 0. \]


Proof Let us start by proving that, if $a > 1$,
\[ \lim_n \frac{a^n}{n} = +\infty. \]
Indeed, writing $a = 1 + b$, with $b > 0$, we see that, if $n \ge 2$,
\[ a^n = (1 + b)^n = 1 + nb + \frac{n(n-1)}{2} b^2 + \dots + b^n > \frac{n(n-1)}{2} b^2. \]
Hence, for every $n \ge 2$,
\[ \frac{a^n}{n} > \frac{n-1}{2} b^2, \]
whence the result, by Theorem 3.23. Let us now show that for every integer $k \ge 1$,
\[ \lim_n \frac{a^n}{n^k} = +\infty. \]
Indeed, writing
\[ \frac{a^n}{n^k} = \left( \frac{(\sqrt[k]{a})^n}{n} \right)^k, \]
we can use the fact that $\lim_n \frac{(\sqrt[k]{a})^n}{n} = +\infty$ in order to arrive at the conclusion.
We now assume that $x \ge 1$. Let $n(x)$ and $n(\alpha)$ be natural numbers such that
\[ n(x) \le x < n(x) + 1, \qquad n(\alpha) \le \alpha < n(\alpha) + 1. \]
Setting $k = n(\alpha) + 1$, we have
\[ \frac{e^x}{x^\alpha} \ge \frac{e^x}{x^{n(\alpha)+1}} = \frac{e^x}{x^k} \ge \frac{e^x}{(n(x) + 1)^k} \ge \frac{e^{n(x)}}{(n(x) + 1)^k}. \]
Moreover,
\[ \lim_{x \to +\infty} \frac{e^{n(x)}}{(n(x) + 1)^k} = \lim_n \frac{e^n}{(n+1)^k} = \frac{1}{e} \lim_n \frac{e^{n+1}}{(n+1)^k} = \frac{1}{e} \lim_m \frac{e^m}{m^k} = +\infty, \]
and the first identity follows.


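Now, by the change of variables “$y = \ln x$,” we obtain
\[ \lim_{x \to +\infty} \frac{\ln x}{x^\alpha} = \lim_{y \to +\infty} \frac{y}{(e^y)^\alpha} = \lim_{y \to +\infty} \frac{y}{e^{\alpha y}} = \lim_{y \to +\infty} \left( \frac{e^y}{y^{1/\alpha}} \right)^{-\alpha} = 0, \]
thereby also proving the second identity. □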


We have thus seen that the exponential function $e^x$ grows at $+\infty$ faster than any
power $x^\alpha$. We now show that the factorial grows still faster.

Theorem 3.30 For every $a \in \mathbb{R}$,
\[ \lim_n \frac{a^n}{n!} = 0. \]

Proof If $|a| < 1$, then $\lim_n a^n = 0$, whence the result. Let us now assume $|a| \ge 1$
and prove by induction that for every $n \ge n(|a|)$,
\[ |a|^{n - n(|a|)} \le n!. \]
Indeed, this is surely true for $n = n(|a|)$. On the other hand, if the inequality is true
for some $n \ge n(|a|)$ then, since $|a| < n + 1$,
\[ |a|^{n + 1 - n(|a|)} = |a|^{n - n(|a|)} \, |a| \le n! \, |a| < n! \, (n + 1) = (n + 1)!, \]
so that the inequality is also true for $n + 1$.
Now, for $n \ge n(|a|) + 1$,
\[ \frac{|a|^n}{n!} = \frac{|a|^{n - 1 - n(|a|)} \, |a|^{1 + n(|a|)}}{n!} \le \frac{(n-1)! \, |a|^{1 + n(|a|)}}{n!} = \frac{|a|^{1 + n(|a|)}}{n}, \]
and the result follows. □
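The final estimate of the proof, $|a|^n / n! \le |a|^{1 + n(|a|)} / n$ for $n \ge n(|a|) + 1$, can be checked numerically. The sketch below is only an illustration in floating-point arithmetic; the helper names `factorial_ratio` and `proof_bound` are hypothetical and do not come from the text.

```python
import math

def factorial_ratio(a, n):
    """The quantity |a|^n / n! whose limit is studied."""
    return abs(a) ** n / math.factorial(n)

def proof_bound(a, n):
    """The bound |a|^(1 + n(|a|)) / n, where n(|a|) is the integer part of |a|.

    It is meant to be used only for n >= n(|a|) + 1, as in the proof.
    """
    k = math.floor(abs(a))
    return abs(a) ** (1 + k) / n

a = 5.0
for n in (6, 10, 20, 40):
    # The ratio stays below the bound, and the bound itself tends to 0 like 1/n.
    print(n, factorial_ratio(a, n), proof_bound(a, n))
```

The bound is far from sharp: it decays like $1/n$, whereas the ratio itself decays much faster, but it is all the proof needs.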


3.9 Liminf and Limsup 


Let $(a_n)_n$ be a sequence of real numbers. For every couple of natural numbers $n, \ell$
we define
\[ \alpha_{n,\ell} = \min\{a_n, a_{n+1}, \dots, a_{n+\ell}\}, \qquad \beta_{n,\ell} = \max\{a_n, a_{n+1}, \dots, a_{n+\ell}\}. \]
If we keep $n$ fixed, the sequence $(\alpha_{n,\ell})_\ell$ is decreasing, and the sequence $(\beta_{n,\ell})_\ell$ is
increasing, so the following limits exist:
\[ \alpha_n = \lim_\ell \alpha_{n,\ell} = \inf\{a_n, a_{n+1}, \dots\}, \qquad \beta_n = \lim_\ell \beta_{n,\ell} = \sup\{a_n, a_{n+1}, \dots\}. \]
Notice that $\alpha_n$ could be equal to $-\infty$, and $\beta_n$ could be equal to $+\infty$. Moreover,
\[ \alpha_n \le a_n \le \beta_n \quad \text{for every } n. \]
Now the sequence $(\alpha_n)_n$ either is constantly equal to $-\infty$ or has real values and is
increasing; similarly, the sequence $(\beta_n)_n$ either is constantly equal to $+\infty$ or has
real values and is decreasing. We can then define the “lower limit” and the “upper
limit” of $(a_n)_n$ as
\[ \liminf_n a_n = \lim_n \alpha_n, \qquad \limsup_n a_n = \lim_n \beta_n. \]
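To make the construction concrete, here is a small Python sketch that computes the tail infimum $\alpha_n$ and the tail supremum $\beta_n$ for the example sequence $a_n = (-1)^n (1 + \frac{1}{n+1})$, which is our own choice and does not appear in the text. A computer can only inspect a finite truncation; for this particular sequence the extreme values of each tail are attained within its first two terms, so the truncated minimum and maximum agree with the true infimum and supremum as long as the tail contains at least two terms. The helper name `tail_inf_sup` is hypothetical.

```python
def tail_inf_sup(seq, n):
    """Return (min, max) of the finite tail seq[n:], standing in for (alpha_n, beta_n)."""
    tail = seq[n:]
    return min(tail), max(tail)

# Example sequence (an assumption for illustration): a_n = (-1)^n (1 + 1/(n+1)).
N = 2000
seq = [(-1) ** n * (1 + 1 / (n + 1)) for n in range(N)]

for n in (0, 1, 10, 100, 1000):
    alpha_n, beta_n = tail_inf_sup(seq, n)
    # alpha_n increases toward -1 and beta_n decreases toward 1.
    print(n, alpha_n, beta_n)
```

For this example the lower limit is $-1$ and the upper limit is $1$, so the sequence has no limit even though both of these quantities exist.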


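Let us see how the lower limit can be characterized. We have three cases:
\[ \liminf_n a_n = \ell \in \mathbb{R} \iff \begin{cases} \text{(i)} \ \ \forall \varepsilon > 0 \ \ \exists \bar n \in \mathbb{N} : \ n \ge \bar n \Rightarrow a_n > \ell - \varepsilon, \\ \text{(ii)} \ \ \forall \varepsilon > 0, \ a_n < \ell + \varepsilon \text{ for infinitely many values of } n. \end{cases} \]
\[ \liminf_n a_n = -\infty \iff \forall \beta \in \mathbb{R}, \ a_n < \beta \text{ for infinitely many values of } n. \]
\[ \liminf_n a_n = +\infty \iff \forall \alpha \in \mathbb{R} \ \ \exists \bar n \in \mathbb{N} : \ n \ge \bar n \Rightarrow a_n > \alpha. \]
Notice that this last case is equivalent to $\lim_n a_n = +\infty$.
Analogously, for the upper limit we also have three cases:
\[ \limsup_n a_n = \ell \in \mathbb{R} \iff \begin{cases} \text{(i)} \ \ \forall \varepsilon > 0 \ \ \exists \bar n \in \mathbb{N} : \ n \ge \bar n \Rightarrow a_n < \ell + \varepsilon, \\ \text{(ii)} \ \ \forall \varepsilon > 0, \ a_n > \ell - \varepsilon \text{ for infinitely many values of } n. \end{cases} \]
\[ \limsup_n a_n = +\infty \iff \forall \alpha \in \mathbb{R}, \ a_n > \alpha \text{ for infinitely many values of } n. \]
\[ \limsup_n a_n = -\infty \iff \forall \beta \in \mathbb{R} \ \ \exists \bar n \in \mathbb{N} : \ n \ge \bar n \Rightarrow a_n < \beta. \]
This last case is equivalent to $\lim_n a_n = -\infty$.

The advantage of considering the lower and upper limits is that they always exist,
while the limit, as we know, might not exist. The following theorem explains this
situation better.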


Theorem 3.31 The sequence $(a_n)_n$ has a limit (possibly equal to either $-\infty$ or
$+\infty$) if and only if $\liminf_n a_n = \limsup_n a_n$; in that case, this value coincides with
$\lim_n a_n$.

Proof It is a direct consequence of the foregoing characterizations. We omit the
details for brevity’s sake. □


The following property will be useful. 


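Proposition 3.32 Let $(a_n)_n$ be a sequence of positive real numbers. Then
\[ \liminf_n \frac{a_{n+1}}{a_n} \le \liminf_n \sqrt[n]{a_n} \le \limsup_n \sqrt[n]{a_n} \le \limsup_n \frac{a_{n+1}}{a_n}. \]

Proof Let us prove the last inequality. Let $\ell = \limsup_n \frac{a_{n+1}}{a_n}$. If $\ell = +\infty$, then there
is nothing to be proved. Thus, assume $\ell < +\infty$, and notice that surely $\ell \ge 0$. Let
$\varepsilon > 0$ be fixed. Then there exists an $\bar n \in \mathbb{N}$ such that
\[ n \ge \bar n \ \Rightarrow\ \frac{a_{n+1}}{a_n} < \ell + \frac{\varepsilon}{2}. \]
As a consequence,
\[ a_{\bar n + 1} < \left( \ell + \frac{\varepsilon}{2} \right) a_{\bar n}, \quad a_{\bar n + 2} < \left( \ell + \frac{\varepsilon}{2} \right) a_{\bar n + 1} < \left( \ell + \frac{\varepsilon}{2} \right)^2 a_{\bar n}, \quad a_{\bar n + 3} < \left( \ell + \frac{\varepsilon}{2} \right) a_{\bar n + 2} < \left( \ell + \frac{\varepsilon}{2} \right)^3 a_{\bar n}, \]
and thus it can be proved by induction that
\[ n \ge \bar n + 1 \ \Rightarrow\ a_n < \left( \ell + \frac{\varepsilon}{2} \right)^{n - \bar n} a_{\bar n} = \left( \ell + \frac{\varepsilon}{2} \right)^n \frac{a_{\bar n}}{\left( \ell + \frac{\varepsilon}{2} \right)^{\bar n}}. \]
Since for every $a > 0$ we have $\lim_n \sqrt[n]{a} = 1$, there exists an $\tilde n \ge \bar n + 1$ such that
\[ n \ge \tilde n \ \Rightarrow\ \sqrt[n]{a_n} < \left( \ell + \frac{\varepsilon}{2} \right) \sqrt[n]{\frac{a_{\bar n}}{\left( \ell + \frac{\varepsilon}{2} \right)^{\bar n}}} < \ell + \varepsilon. \]
We have thus proved that $\limsup_n \sqrt[n]{a_n} \le \ell$.
The first inequality can be proved similarly. □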




In this chapter we discover some more subtle properties of the set of real numbers. 
This investigation will emphasize two important concepts, which will then be 
analyzed in the general setting of metric spaces: compactness and completeness. 


4.1 Some Preliminaries on Sequences 


Let $U$ be a subset of a metric space $E$. Let us recall that a point $x_0$ is an “adherent
point” of $U$ if for every $\rho > 0$ one has that $B(x_0, \rho) \cap U \ne \emptyset$. On the other hand,
$x_0$ is a “cluster point” for $U$ if for every $\rho > 0$ one has that $B(x_0, \rho) \cap U$ contains
infinitely many elements of $U$.

We can characterize the notion of “adherent point” by making use of sequences. 


Proposition 4.1 An element $x$ of $E$ is an adherent point of $U$ if and only if there
exists a sequence $(a_n)_n$ in $U$ such that $\lim_n a_n = x$.

Proof If $x$ is an adherent point of $U$, then for every $n \in \mathbb{N}$ the intersection
$B(x, \frac{1}{n+1}) \cap U$ is nonempty, and we can select one of its elements, calling it $a_n$.
In such a way, we have constructed a sequence $(a_n)_n$ in $U$, and it is now a simple
task to verify that $\lim_n a_n = x$.

Assume now that there exists a sequence $(a_n)_n$ in $U$ such that $\lim_n a_n = x$. Then,
for any $\rho > 0$, there exists an $\bar n \in \mathbb{N}$ such that
\[ n \ge \bar n \ \Rightarrow\ a_n \in B(x, \rho). \]
Hence, $B(x, \rho) \cap U$ is nonempty, proving that $x$ is an adherent point of $U$. □


Let us now consider two metric spaces $E$ and $F$ and a function $f : E \to F$. We
want to characterize the continuity of $f$ at a point $x_0 \in E$ by the use of sequences.


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_4


94 4 Compactness and Completeness 


Proposition 4.2 The function $f$ is continuous at $x_0$ if and only if, for any sequence
$(a_n)_n$ in $E$,
\[ \lim_n a_n = x_0 \ \Rightarrow\ \lim_n f(a_n) = f(x_0). \]

Proof Assume that $f$ is continuous at $x_0$, and let $(a_n)_n$ be a sequence in $E$ such that
$\lim_n a_n = x_0$. By Theorem 3.13 on the limit of the composition of functions,
\[ \lim_n f(a_n) = f(\lim_n a_n) = f(x_0), \]
so that one of the two implications is proved.

Let us now assume that $f$ is not continuous at $x_0$. Then there is an $\varepsilon > 0$
such that, for every $\delta > 0$, there exists an $x \in E$ such that $d(x, x_0) < \delta$ and
$d(f(x), f(x_0)) \ge \varepsilon$. Taking $\delta = \frac{1}{n+1}$, for every $n \in \mathbb{N}$ there exists an $a_n$ in $E$ such
that $d(a_n, x_0) < \frac{1}{n+1}$ and $d(f(a_n), f(x_0)) \ge \varepsilon$. Then $\lim_n a_n = x_0$, but surely it
cannot be that $\lim_n f(a_n) = f(x_0)$. The proof is thus completed. □


As an immediate corollary, we have the following characterization of the limit,
assuming $x_0$ to be a cluster point.

Proposition 4.3 We have that $\lim_{x \to x_0} f(x) = l$ if and only if, for any sequence $(a_n)_n$
in $E \setminus \{x_0\}$,
\[ \lim_n a_n = x_0 \ \Rightarrow\ \lim_n f(a_n) = l. \]


Given any sequence $(a_n)_n$, we define a “subsequence” by selecting a strictly
increasing sequence of indices $(n_k)_k$ and considering the composition
\[ k \mapsto n_k \mapsto a_{n_k}. \]
We will denote by $(a_{n_k})_k$ such a subsequence.
Notice that, since the indices $n_k$ are in $\mathbb{N}$ and $n_{k+1} > n_k$, it must be that
$n_{k+1} \ge n_k + 1$. As a consequence, one proves by induction that $n_k \ge k$ for every $k$, whence
\[ \lim_k n_k = +\infty. \]
k 


Proposition 4.4 If a sequence has a limit, then all its subsequences must have the 
same limit. 


Proof Indeed, by the Change of Variables Formula (3.1),
\[ \lim_{k \to +\infty} a_{n_k} = \lim_{n \to \lim_{k \to +\infty} n_k} a_n = \lim_{n \to +\infty} a_n, \]
thereby proving the result. □


Theorem 4.5 Any sequence of real numbers has a monotone subsequence. 


Proof Let $(a_n)_n$ be a sequence in $\mathbb{R}$. We say that $\bar n$ is a “lookout point” for the
sequence if $a_{\bar n} \ge a_n$ for every $n \ge \bar n$. Now we distinguish three cases.

Case 1. There are infinitely many lookout points; let us order them in a strictly
increasing sequence of indices $(n_k)_k$. Then the subsequence $(a_{n_k})_k$ is decreasing.
Case 2. There are only finitely many lookout points. Let $N$ be the largest one,
and choose $n_0 > N$. Then, since $n_0$ is not a lookout point, there exists $n_1 > n_0$
such that $a_{n_1} > a_{n_0}$. By induction, we construct a strictly increasing sequence
of indices $(n_k)_k$ in this way: Once $n_k$ has been defined, since it is not a lookout
point, there exists an $n_{k+1} > n_k$ such that $a_{n_{k+1}} > a_{n_k}$. The subsequence $(a_{n_k})_k$
thus constructed is strictly increasing.
Case 3. There are no lookout points. In this case, choose $n_0$ arbitrarily, and
proceed as in Case 2. □
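The notion of a lookout point can be explored on a finite list, with the caveat that a finite list only illustrates the idea: the theorem concerns infinite sequences, and in a finite list the last index is always a lookout point. In the sketch below we assume the convention that an index $i$ is a lookout point when $a_i \ge a_j$ for every $j \ge i$; the function name `lookout_indices` is hypothetical.

```python
def lookout_indices(seq):
    """Indices i such that seq[i] >= seq[j] for every j >= i.

    Scanning from the right while tracking the running maximum finds
    them in a single pass.
    """
    result = []
    running_max = None
    for i in range(len(seq) - 1, -1, -1):
        if running_max is None or seq[i] >= running_max:
            result.append(i)
            running_max = seq[i]
    result.reverse()
    return result

seq = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
idx = lookout_indices(seq)
# The values at the lookout points form a decreasing subsequence, as in Case 1.
print(idx, [seq[i] for i in idx])
```

Reading the values at the returned indices from left to right gives a decreasing subsequence, which is the mechanism behind Case 1 of the proof.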


4.2 Compact Sets 
Here is a fundamental property of closed and bounded intervals. 


Theorem 4.6 (Bolzano-Weierstrass Theorem) Every sequence $(a_n)_n$ in $[a, b]$
has a subsequence $(a_{n_k})_k$ having a limit in $[a, b]$.

Proof By Theorem 4.5, there is a monotone subsequence $(a_{n_k})_k$, which by Corollary 3.27
has a limit $\lim_k a_{n_k} = l$. Since $a \le a_{n_k} \le b$ for every $k$, by Corollary 3.8
it must be that $a \le l \le b$, thereby proving the result. □

In a metric space $E$, we will say that a subset $U$ is “compact” if every sequence
$(a_n)_n$ in $U$ has a subsequence $(a_{n_k})_k$ having a limit in $U$.

The Bolzano-Weierstrass Theorem 4.6 thus states that if $E = \mathbb{R}$, then the intervals
of the type $U = [a, b]$ are compact sets. In what follows, a subset of a metric space
will be said to be “bounded” whenever it is contained in a ball.


Theorem 4.7 Every compact subset of E is closed and bounded. 


Proof Assume that $U \subseteq E$ is compact. Taking an adherent point $x$ of $U$, by Proposition 4.1, there
is a sequence $(a_n)_n$ in $U$ such that $\lim_n a_n = x$. Since $U$ is compact, there exists a
subsequence $(a_{n_k})_k$ having a limit in $U$. But, since it is a subsequence, $\lim_k a_{n_k} = x$,
and hence $x \in U$. We have thus shown that every adherent point of $U$ belongs to $U$;
hence, $U$ is closed.

Now fix some $x_0 \in U$ arbitrarily. We will show that if $n \in \mathbb{N}$ is sufficiently large,
then $U \subseteq B(x_0, n)$. By contradiction, if this is false, then we can build a sequence
$(a_n)_n$ in $U$ such that $d(a_n, x_0) \ge n$ for every $n \in \mathbb{N}$. Since $U$ is compact, there exists
a subsequence $(a_{n_k})_k$ having a limit $x \in U$. Using the triangle inequality,
\[ |d(a_{n_k}, x_0) - d(x, x_0)| \le d(a_{n_k}, x), \]
whence $\lim_k d(a_{n_k}, x_0) = d(x, x_0)$, whereas it should be
\[ \lim_k d(a_{n_k}, x_0) = +\infty, \]
a contradiction. Therefore, $U$ must be bounded. □
Let us focus our attention now on the compact subsets of $\mathbb{R}^N$, with $N \ge 1$.
Theorem 4.8 A subset of $\mathbb{R}^N$ is compact if and only if it is closed and bounded.

Proof We already know that every compact set is closed and bounded. Assume now
that $U$ is a closed and bounded subset of $\mathbb{R}^N$. For simplicity, we will assume that
$N = 2$. Then $U$ is contained in a rectangle $I = [a, b] \times [c, d]$. Let $(a_n)_n$ be a
sequence in $U$. Then $a_n = (a_n^1, a_n^2)$, with $a_n^1 \in [a, b]$ and $a_n^2 \in [c, d]$. By the
Bolzano-Weierstrass Theorem 4.6, the sequence $(a_n^1)_n$ has a subsequence $(a_{n_k}^1)_k$
having a limit $l_1 \in [a, b]$. Let us now consider the sequence $(a_{n_k}^2)_k$, with the same
indices $n_k$ as the one we just found; it is a subsequence of $(a_n^2)_n$. By the Bolzano-Weierstrass
Theorem 4.6, the sequence $(a_{n_k}^2)_k$ has a subsequence $(a_{n_{k_j}}^2)_j$ having a
limit $l_2 \in [c, d]$. By Theorem 3.6,
\[ \lim_j a_{n_{k_j}} = \left( \lim_j a_{n_{k_j}}^1, \lim_j a_{n_{k_j}}^2 \right) = (l_1, l_2). \]
By Proposition 4.1, $l = (l_1, l_2)$ is an adherent point of $U$. Since $U$ is closed, $l$ is
necessarily an element of $U$. □


The following property of compact sets will be useful. 


Theorem 4.9 Let $U \subseteq \mathbb{R}^N$ be a compact set. If $(A_i)_{i \in I}$ is a family (not necessarily
a countable family) of open sets such that
\[ U \subseteq \bigcup_{i \in I} A_i, \]
then there exists a finite subfamily $(A_{i_1}, \dots, A_{i_n})$ of $(A_i)_{i \in I}$ such that
\[ U \subseteq A_{i_1} \cup \dots \cup A_{i_n}. \]

Proof For simplicity, we assume $N = 2$. Let us first prove the statement in the case
where $U$ is a closed rectangle, and let us denote it by $R_0 = [a_0, b_0] \times [c_0, d_0]$. By
contradiction, assume that there is an open covering $(A_i)_{i \in I}$ of $R_0$ without finite
subcoverings. We split the rectangle $R_0$ into four smaller equal closed rectangles,
connecting the midpoints of its sides. Among these four rectangles, there is at least
one for which there is no finite subfamily of $(A_i)_{i \in I}$ covering it. Let us call it $R_1$. We
now proceed recursively and construct in this way a sequence of closed rectangles
$R_k = [a_k, b_k] \times [c_k, d_k]$ such that
\[ R_0 \supseteq R_1 \supseteq R_2 \supseteq \dots \supseteq R_k \supseteq R_{k+1} \supseteq \dots, \]
for each of which there is no finite subfamily of $(A_i)_{i \in I}$ covering it. By the Cantor
Theorem 1.9, there exist $\bar x$ belonging to all intervals $[a_k, b_k]$ and $\bar y$ belonging to all
intervals $[c_k, d_k]$, so that $(\bar x, \bar y) \in R_k$ for every $k \in \mathbb{N}$. Since $(\bar x, \bar y)$ belongs to $U$,
there is at least one $A_i$ containing it. This set $A_i$ is open, and the dimensions of
$R_k$ tend to zero as $k$ tends to $+\infty$. Then, for $k$ sufficiently large, the rectangle $R_k$
will be entirely contained in $A_i$. But this is a contradiction, since there is no finite
subfamily of $(A_i)_{i \in I}$ covering $R_k$.

Now let $U$ be any closed and bounded subset of $\mathbb{R}^2$. Then $U$ is contained in a
rectangle $[a, b] \times [c, d]$. If $(A_i)_{i \in I}$ is an open covering of $U$, then
\[ [a, b] \times [c, d] \subseteq \Big( \bigcup_{i \in I} A_i \Big) \cup (\mathbb{R}^2 \setminus U). \]
Since $\mathbb{R}^2 \setminus U$ is open, we now have an open covering of $[a, b] \times [c, d]$, and by the
first part of the proof, there is a finite subfamily $(A_{i_1}, \dots, A_{i_n})$ of $(A_i)_{i \in I}$ such that
\[ [a, b] \times [c, d] \subseteq (A_{i_1} \cup \dots \cup A_{i_n}) \cup (\mathbb{R}^2 \setminus U). \]
Consequently, $U \subseteq A_{i_1} \cup \dots \cup A_{i_n}$, and the proof is thus completed. □
Note The preceding theorem indeed holds in any metric space, and it can be shown 
that the stated property is necessary and sufficient for the compactness of a set U. 
4.3. Compactness and Continuity 
In what follows, we will say that a function $f : A \to \mathbb{R}$ is “bounded from above”
(or “bounded from below” or “bounded”) if its image $f(A)$ is. We will say
that “$f$ has a maximum” (or “$f$ has a minimum”) if $f(A)$ does. In the case where
$f$ has a maximum or a minimum, we will call “maximum point” any $\bar x$ for which
$f(\bar x) = \max f(A)$ and “minimum point” any $\bar x$ for which $f(\bar x) = \min f(A)$.


Theorem 4.10 (Weierstrass Theorem) If $U$ is a compact set and $f : U \to \mathbb{R}$ is
a continuous function, then $f$ has a maximum and a minimum.

Proof Let $s = \sup f(U)$. We will prove that there is a maximum point, i.e., an
$\bar x \in U$ such that $f(\bar x) = s$.

We first note that there is a sequence $(y_n)_n$ in $f(U)$ such that $\lim_n y_n = s$. Indeed,
if $s \in \mathbb{R}$, then for every $n \ge 1$ we can find a $y_n \in f(U)$ such that $s - \frac{1}{n} < y_n \le s$,
and if $s = +\infty$, then for every $n$ there is a $y_n \in f(U)$ such that $y_n > n$. In both
cases, we have $\lim_n y_n = s$.

Correspondingly, we can find a sequence $(x_n)_n$ in $U$ such that $f(x_n) = y_n$. Since
$U$ is compact, there exists a subsequence $(x_{n_k})_k$ having a limit $\bar x \in U$. Because
$\lim_n y_n = s$ and $y_{n_k} = f(x_{n_k})$, the subsequence $(y_{n_k})_k$ also has the same limit $s$.
Then, by the continuity of $f$,
\[ f(\bar x) = f(\lim_k x_{n_k}) = \lim_k f(x_{n_k}) = \lim_k y_{n_k} = s. \]
The theorem is thus proved in what concerns the existence of the maximum. To deal
with the minimum, either one proceeds analogously or one considers the continuous
function $g = -f$ and uses the fact that $g$ has a maximum. □


The following theorem holds for a general metric space F. 


Theorem 4.11 If $U$ is a compact set and $f : U \to F$ is a continuous function, then
$f(U)$ is a compact set.

Proof Let $(y_n)_n$ be a sequence in $f(U)$. We can then find a sequence $(x_n)_n$ in
$U$ such that $f(x_n) = y_n$ for every $n \in \mathbb{N}$. Since $U$ is compact, there exists a
subsequence $(x_{n_k})_k$ that has a limit $\bar x \in U$. Recalling that $y_{n_k} = f(x_{n_k})$ and that $f$
is continuous,
\[ \lim_k f(x_{n_k}) = f(\lim_k x_{n_k}) = f(\bar x). \]
Therefore, the subsequence $(y_{n_k})_k$ has a limit, precisely $f(\bar x)$, in $f(U)$. □

We now introduce the concept of “uniform continuity.” First, recall what it means
for $f : E \to F$ to be “continuous”: $f$ is continuous at every point
$x_0 \in E$, i.e.,
\[ \forall x_0 \in E \ \ \forall \varepsilon > 0 \ \ \exists \delta > 0 : \ \forall x \in E \quad d(x, x_0) < \delta \ \Rightarrow\ d(f(x), f(x_0)) < \varepsilon. \]
Notice that, in general, the choice of $\delta$ depends on both $\varepsilon$ and $x_0$. We will say that $f$
is “uniformly continuous” whenever such a $\delta$ does not depend on $x_0$, i.e.,
\[ \forall \varepsilon > 0 \ \ \exists \delta > 0 : \ \forall x_0 \in E \ \ \forall x \in E \quad d(x, x_0) < \delta \ \Rightarrow\ d(f(x), f(x_0)) < \varepsilon. \]


The following theorem states that continuity implies uniform continuity when 
the domain is a compact set. 


Theorem 4.12 (Heine Theorem) If $U$ is a compact set and $f : U \to F$ is a
continuous function, then $f$ is uniformly continuous.

Proof By contradiction, assume that $f$ is not uniformly continuous, i.e.,
\[ \exists \varepsilon > 0 : \ \forall \delta > 0 \ \ \exists x_0 \in U \ \ \exists x \in U : \quad d(x, x_0) < \delta \ \text{ and } \ d(f(x), f(x_0)) \ge \varepsilon. \]
Let us fix such an $\varepsilon > 0$, and choose $\delta = \frac{1}{n+1}$, with $n \in \mathbb{N}$. Correspondingly, there
are $x_n^0$ and $x_n$ in $U$ such that
\[ d(x_n, x_n^0) < \frac{1}{n+1} \quad \text{and} \quad d(f(x_n), f(x_n^0)) \ge \varepsilon. \]
We thus have two sequences, $(x_n)_n$ and $(x_n^0)_n$, in $U$. Since $U$ is compact, there exists
a subsequence $(x_{n_k})_k$ having a limit $\bar x \in U$. Let us now consider the subsequence
$(x_{n_k}^0)_k$, with the same indices $n_k$ as the one we just found. Since $d(x_{n_k}, x_{n_k}^0)$ tends
to zero, this subsequence $(x_{n_k}^0)_k$ has the same limit $\bar x$. By the continuity of $f$,
\[ \lim_k f(x_{n_k}) = f(\bar x) \quad \text{and} \quad \lim_k f(x_{n_k}^0) = f(\bar x), \]
implying that
\[ \lim_k d(f(x_{n_k}), f(x_{n_k}^0)) = 0, \]
in contradiction to the fact that $d(f(x_{n_k}), f(x_{n_k}^0)) \ge \varepsilon > 0$ for every $k \in \mathbb{N}$. □


4.4 Complete Metric Spaces 
We will now introduce the concept of “completeness” for a metric space $E$. To this
end, we first need to introduce a special class of sequences. We will say that $(a_n)_n$
is a “Cauchy sequence” in $E$ if
\[ \forall \varepsilon > 0 \ \ \exists \bar n : \quad [\, m \ge \bar n \ \text{ and } \ n \ge \bar n \,] \ \Rightarrow\ d(a_m, a_n) < \varepsilon. \]
The metric space $E$ will be said to be “complete” if every Cauchy sequence has a
limit in $E$.

It is readily seen that if $(a_n)_n$ has a limit $l \in E$, then it is a Cauchy sequence.
Indeed, for any fixed $\varepsilon > 0$, taking $m$ and $n$ large enough, we have
\[ d(a_m, a_n) \le d(a_m, l) + d(l, a_n) < 2\varepsilon. \]
In contrast, a Cauchy sequence in $E$ might not have a limit in the space $E$. As an
example, take $\mathbb{Q}$ with the usual distance and the sequence $a_n = \left( 1 + \frac{1}{n} \right)^n$, whose
limit is $e \notin \mathbb{Q}$. Indeed, $\mathbb{Q}$ is not complete, whereas $\mathbb{R}$ is, as we will now prove.
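Theorem 4.13 $\mathbb{R}$ is complete.
Proof Let $(a_n)_n$ be a Cauchy sequence in $\mathbb{R}$. By definition (taking $\varepsilon = 1$), there
exists an $\bar n_1$ such that, for every $m \ge \bar n_1$ and $n \ge \bar n_1$, we have $d(a_n, a_m) < 1$. Taking
$m = \bar n_1$ and setting $a = a_{\bar n_1} - 1$, $b = a_{\bar n_1} + 1$, we thus see that the sequence
$(a_n)_{n \ge \bar n_1}$ is contained in the interval $[a, b]$. By the Bolzano-Weierstrass Theorem 4.6,
there exists a subsequence $(a_{n_k})_k$ having a limit $l \in [a, b]$. We now want to prove
that
\[ \lim_n a_n = l. \]
Let $\varepsilon > 0$ be fixed. Since $(a_n)_n$ is a Cauchy sequence,
\[ \exists \bar n : \quad m \ge \bar n \ \text{ and } \ n \ge \bar n \ \Rightarrow\ d(a_n, a_m) < \varepsilon. \]
Moreover, since $\lim_k a_{n_k} = l$ and $\lim_k n_k = +\infty$,
\[ \exists \bar k : \quad k \ge \bar k \ \Rightarrow\ d(a_{n_k}, l) < \varepsilon \ \text{ and } \ n_k \ge \bar n. \]
Then for every $n \ge \bar n$,
\[ d(a_n, l) \le d(a_n, a_{n_{\bar k}}) + d(a_{n_{\bar k}}, l) < \varepsilon + \varepsilon = 2\varepsilon, \]
thereby completing the proof. □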
We now extend the previous theorem to higher dimensions. 


Theorem 4.14 $\mathbb{R}^N$ is complete.

Proof For simplicity, we assume $N = 2$. Let $(a_n)_n$ be a Cauchy sequence in $\mathbb{R}^2$.
We write each vector $a_n \in \mathbb{R}^2$ in its coordinates
\[ a_n = (a_{n,1}, a_{n,2}). \]
Since
\[ |a_{m,1} - a_{n,1}| \le \|a_m - a_n\|, \qquad |a_{m,2} - a_{n,2}| \le \|a_m - a_n\|, \]
we see that both $(a_{n,1})_n$ and $(a_{n,2})_n$ are Cauchy sequences in $\mathbb{R}$. Hence, since $\mathbb{R}$ is
complete, each of them has a limit
\[ \lim_n a_{n,1} = l_1 \in \mathbb{R}, \qquad \lim_n a_{n,2} = l_2 \in \mathbb{R}. \]
Then
\[ \lim_n a_n = (\lim_n a_{n,1}, \lim_n a_{n,2}) = (l_1, l_2), \]
which is an element of $\mathbb{R}^2$. □


A normed vector space that is complete with respect to the distance given by
its norm is said to be a “Banach space.” We have thus proved that $\mathbb{R}^N$ is a Banach
space.

The following theorem provides us with the Cauchy criterion for functions.

Theorem 4.15 Let $F$ be a complete metric space. Then a function $f : E \to F$ has
a limit $\lim_{x \to x_0} f(x)$ in $F$ if and only if the following property holds:
\[ \forall \varepsilon > 0 \ \ \exists \delta > 0 : \quad [\, 0 < d(x, x_0) < \delta \ \text{ and } \ 0 < d(x', x_0) < \delta \,] \ \Rightarrow\ d(f(x), f(x')) < \varepsilon. \]

Proof Assume that $\lim_{x \to x_0} f(x) = l \in F$. The conclusion then follows from the
definition of limit and the triangle inequality
\[ d(f(x), f(x')) \le d(f(x), l) + d(l, f(x')). \]
On the other hand, if the property stated in the theorem holds, take a sequence
$(a_n)_n$ in $E \setminus \{x_0\}$ such that $\lim_n a_n = x_0$. Then we see that $(f(a_n))_n$ is a Cauchy sequence,
so it has a limit $l \in F$. To see that this limit does not depend on the sequence,
let $(a'_n)_n$ be another such sequence with $\lim_n a'_n = x_0$. By the foregoing property,
for every $\varepsilon > 0$ we have $d(f(a_n), f(a'_n)) < \varepsilon$ for $n$ sufficiently large, from
which it follows that $(f(a'_n))_n$ has the same limit as $(f(a_n))_n$. Having thus proved
that $\lim_n f(a_n) = l$ for any sequence $(a_n)_n$ in $E \setminus \{x_0\}$ such that $\lim_n a_n = x_0$, the conclusion
follows from Proposition 4.3. □




4.5 Completeness and Continuity 


The following theorem provides a useful extension property for uniformly continuous
functions. We say that a set is “dense” in $E$ if its closure coincides with $E$.

Theorem 4.16 Let $\tilde E$ be a dense subset of $E$, and let $F$ be a complete metric space.
If $f : \tilde E \to F$ is uniformly continuous, then there exists a unique continuous
function $\tilde f : E \to F$ whose restriction to $\tilde E$ coincides with $f$.

Proof Taking $x \in E$, there exists a sequence $(x_n)_n$ in $\tilde E$ such that $\lim_n x_n = x$.
Since $f$ is uniformly continuous and $(x_n)_n$ is a Cauchy sequence, it follows that
$(f(x_n))_n$ is also a Cauchy sequence. Hence, since $F$ is complete, it has a limit
$y \in F$. We define $\tilde f(x) = \lim_n f(x_n) = y$.

Let us verify that this is a good definition. If $(\tilde x_n)_n$ is another sequence in $\tilde E$ such
that $\lim_n \tilde x_n = x$, then $\lim_n d(x_n, \tilde x_n) = 0$, and since $f$ is uniformly continuous, then
also $\lim_n d(f(x_n), f(\tilde x_n)) = 0$. Hence, $(f(\tilde x_n))_n$ necessarily has the same limit $y$
as $(f(x_n))_n$, and the definition is consistent.

Clearly, the function $\tilde f$ thus defined extends $f$ since, if $x \in \tilde E$, we can take
the sequence $(x_n)_n$ as being constantly equal to $x$. Let us now prove that $\tilde f$ is
(uniformly) continuous. Once $\varepsilon > 0$ has been fixed, let $\delta > 0$ be such that, taking
$u, v \in \tilde E$,
\[ d(u, v) < 2\delta \ \Rightarrow\ d(f(u), f(v)) < \frac{\varepsilon}{3}. \]
If $x, y$ are two points in $E$ such that $d(x, y) < \delta$, then we can take two sequences
$(x_n)_n$ and $(y_n)_n$ in $\tilde E$ such that $\lim_n x_n = x$ and $\lim_n y_n = y$. Then, since
$\lim_n f(x_n) = \tilde f(x)$ and $\lim_n f(y_n) = \tilde f(y)$, for all sufficiently large $n$ it will be
that $d(x_n, y_n) < 2\delta$ and
\[ d(\tilde f(x), \tilde f(y)) \le d(\tilde f(x), f(x_n)) + d(f(x_n), f(y_n)) + d(f(y_n), \tilde f(y)) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]
which proves that $\tilde f$ is uniformly continuous.
To conclude the proof, let $\hat f : E \to F$ be any continuous function extending $f$.
Then, for every $x \in E$, taking a sequence $(x_n)_n$ in $\tilde E$ such that $\lim_n x_n = x$,
\[ \hat f(x) = \lim_n \hat f(x_n) = \lim_n f(x_n) = \tilde f(x). \]
We have thus proved that $\tilde f$ is the only possible continuous extension of $f$ to $E$. □




4.6 Spaces of Continuous Functions 


Let $E$ and $F$ be two metric spaces. We consider a sequence of functions $f_n : E \to F$,
and we want to examine, whenever it exists, the limit

\[
\lim_n f_n(x).
\]

Clearly enough, this limit could exist for some $x \in E$ and not exist at all for others.
So assume that, for some subset $U \subseteq E$, there is a function $f : U \to F$ for which

\[
\lim_n f_n(x) = f(x), \quad \text{for every } x \in U.
\]

In this case we will say that the sequence $(f_n)_n$ “converges pointwise” to $f$ on $U$;
it thus happens that

\[
\forall x \in U \quad \forall \varepsilon > 0 \quad \exists \bar n \in \mathbb{N} : \quad n \ge \bar n \;\Rightarrow\; d(f_n(x), f(x)) < \varepsilon.
\]

If the preceding choice of $\bar n$ does not depend on $x \in U$, we will say that the
sequence $(f_n)_n$ “converges uniformly” to $f$ on $U$; in this case,

\[
\forall \varepsilon > 0 \quad \exists \bar n \in \mathbb{N} : \quad \forall x \in U \quad n \ge \bar n \;\Rightarrow\; d(f_n(x), f(x)) < \varepsilon,
\]

i.e., equivalently,

\[
\lim_n \Big[ \sup \{ d(f_n(x), f(x)) : x \in U \} \Big] = 0.
\]


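Let us provide an example of a sequence $(f_n)_n$ which converges pointwise, but
not uniformly. Let $f_n : [0, 1] \to \mathbb{R}$ be defined for $n \ge 1$ as

\[
f_n(x) =
\begin{cases}
nx & \text{if } 0 \le x \le \frac{1}{n}, \\
2 - nx & \text{if } \frac{1}{n} \le x \le \frac{2}{n}, \\
0 & \text{if } \frac{2}{n} \le x \le 1.
\end{cases}
\]

It is easily seen that $\lim_n f_n(x) = 0$ for every $x \in [0, 1]$, but the convergence is not
uniform, since $f_n(\frac{1}{n}) = 1$ for every $n \ge 1$.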
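
The behavior of this example can be checked numerically. The following Python sketch is only an illustration (the names `tent` and `sup_on_grid` are ours, not the book's); it assumes the piecewise definition $f_n(x) = nx$ on $[0, 1/n]$, $f_n(x) = 2 - nx$ on $[1/n, 2/n]$, and $f_n(x) = 0$ on $[2/n, 1]$, and uses exact rational arithmetic so that no rounding is involved.

```python
from fractions import Fraction


def tent(n: int, x: Fraction) -> Fraction:
    """Tent function f_n on [0, 1]: rises to 1 at x = 1/n, returns to 0 at x = 2/n."""
    if x <= Fraction(1, n):
        return n * x
    if x <= Fraction(2, n):
        return 2 - n * x
    return Fraction(0)


def sup_on_grid(n: int, points: int = 1000) -> Fraction:
    """Maximum of f_n over a uniform grid of [0, 1] enlarged with the point 1/n."""
    grid = [Fraction(k, points) for k in range(points + 1)] + [Fraction(1, n)]
    return max(tent(n, x) for x in grid)


x = Fraction(1, 10)
# Pointwise convergence: for fixed x > 0, f_n(x) = 0 as soon as 2/n <= x.
print([tent(n, x) for n in (5, 10, 20, 40)])
# No uniform convergence: the supremum over [0, 1] does not tend to 0.
print([sup_on_grid(n) for n in (5, 10, 20, 40)])
```

The grid maximum is only a lower bound for the supremum in general; here it is attained because the point $1/n$ is added to the grid on purpose.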

The uniform convergence has good behavior with respect to continuity, as the 
following theorem states. 


Theorem 4.17 If each function $f_n : E \to F$ is continuous on $U \subseteq E$ and $(f_n)_n$
converges uniformly to $f$ on $U$, then $f$ is also continuous on $U$.




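Proof Consider an arbitrary point $x_0$ of $U$; we will show that $f$ is continuous at $x_0$.
Let $\varepsilon > 0$ be fixed. Then there exists $\bar n \in \mathbb{N}$ such that

\[
\forall x \in U \quad n \ge \bar n \;\Rightarrow\; d(f_n(x), f(x)) < \frac{1}{3}\varepsilon.
\]

Since

\[
d(f(x), f(x_0)) \le d(f(x), f_n(x)) + d(f_n(x), f_n(x_0)) + d(f_n(x_0), f(x_0)),
\]

taking $n = \bar n$ we have that

\[
d(f(x), f(x_0)) < \frac{2}{3}\varepsilon + d(f_{\bar n}(x), f_{\bar n}(x_0)).
\]

By the continuity of $f_{\bar n}$ at $x_0$, we can find a $\delta > 0$ such that

\[
d(x, x_0) < \delta \;\Rightarrow\; d(f_{\bar n}(x), f_{\bar n}(x_0)) < \frac{1}{3}\varepsilon.
\]

Hence,

\[
d(x, x_0) < \delta \;\Rightarrow\; d(f(x), f(x_0)) < \frac{2}{3}\varepsilon + \frac{1}{3}\varepsilon = \varepsilon,
\]

thereby proving that $f$ is continuous at $x_0$. ∎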


Note When $(f_n)_n$ is a sequence of continuous functions that converges uniformly
on $U$ to some function, for every $x_0 \in U$ we can write

\[
\lim_n \Big( \lim_{x \to x_0} f_n(x) \Big) = \lim_{x \to x_0} \Big( \lim_n f_n(x) \Big).
\]

In what follows, we say that a function $f : E \to F$ is “bounded” if its image
$f(E)$ is bounded, and we denote by $B(E, F)$ the set of all those functions. We
define, in $B(E, F)$,

\[
d_\infty(f, g) = \sup \{ d(f(x), g(x)) : x \in E \}.
\]

It is readily verified that this is a distance; hence, $B(E, F)$ will now be treated as a
metric space. Taking a sequence $(f_n)_n$ and a function $f$ in the space $B(E, F)$, we
have

\[
\lim_n f_n = f \quad \Longleftrightarrow \quad (f_n)_n \text{ converges uniformly to } f \text{ on } E.
\]


Let us now investigate some further properties of this space of functions. 




Theorem 4.18 If $F$ is complete, then $B(E, F)$ is also complete.

Proof Let $(f_n)_n$ be a Cauchy sequence in $B(E, F)$. Since, for every $x \in E$,

\[
d(f_m(x), f_n(x)) \le d_\infty(f_m, f_n),
\]

we have that $(f_n(x))_n$ is a Cauchy sequence in $F$ for each $x \in E$. Since $F$ is
complete, the sequence $(f_n(x))_n$ has a limit in $F$; we will denote it by $f(x)$. In this
way, we have indeed defined a function $f : E \to F$, and we will now prove that
$\lim_n f_n = f$ in $B(E, F)$. Let $\varepsilon > 0$ be fixed. Since $(f_n)_n$ is a Cauchy sequence,
there exists an $\bar n \in \mathbb{N}$ such that

\[
[\, m \ge \bar n \text{ and } n \ge \bar n \,] \;\Rightarrow\; d_\infty(f_n, f_m) < \varepsilon
\;\Rightarrow\; d(f_n(x), f_m(x)) < \varepsilon, \quad \text{for every } x \in E.
\]

By the continuity of the distance, we have

\[
\lim_m d(f_n(x), f_m(x)) = d(f_n(x), f(x)),
\]

whence

\[
n \ge \bar n \;\Rightarrow\; d(f_n(x), f(x)) \le \varepsilon, \quad \text{for every } x \in E.
\]

Hence, $f$ belongs to $B(E, F)$, and

\[
n \ge \bar n \;\Rightarrow\; d_\infty(f_n, f) \le \varepsilon.
\]

The statement is thus proved. ∎

If $f \in B(E, F)$ and $F$ is a normed vector space, we can define

\[
\|f\|_\infty = \sup \{ \|f(x)\| : x \in E \}.
\]

One easily verifies that this is indeed a norm on $B(E, F)$ and that

\[
d_\infty(f, g) = \|f - g\|_\infty.
\]

As an immediate consequence of the preceding theorem, we have the following
corollary.

Corollary 4.19 If $F$ is a Banach space, then $B(E, F)$ is also a Banach space.

Proof Since $F$ is complete with respect to the distance induced by its norm,
$B(E, F)$ is also complete, by Theorem 4.18. ∎




Let us denote by $C(E, F)$ the set of continuous functions $f : E \to F$. We are
now interested in considering the space $C(E, F) \cap B(E, F)$, made up of bounded
and continuous functions.

Theorem 4.20 The set $C(E, F) \cap B(E, F)$ is closed in $B(E, F)$.

Proof Let $f \in B(E, F)$ be an adherent point of $C(E, F) \cap B(E, F)$. By
Proposition 4.1, there exists a sequence $(f_n)_n$ in $C(E, F) \cap B(E, F)$ such that
$\lim_n f_n = f$. By Theorem 4.17, $f$ is continuous, since it is the uniform limit of
continuous functions, hence $f \in C(E, F) \cap B(E, F)$. We have thus proved that the
closure of $C(E, F) \cap B(E, F)$ coincides with $C(E, F) \cap B(E, F)$ itself. ∎

The set $C(E, F) \cap B(E, F)$ inherits the distance $d_\infty$ from $B(E, F)$ and, when
$F$ is a normed vector space, also its norm $\| \cdot \|_\infty$. The following corollaries will be
useful.

Corollary 4.21 If $F$ is complete, then $C(E, F) \cap B(E, F)$ is also complete. Hence,
if $F$ is a Banach space, then $C(E, F) \cap B(E, F)$ is also a Banach space.

Proof Any Cauchy sequence in $C(E, F) \cap B(E, F)$ has a limit in $B(E, F)$, since
$B(E, F)$ is complete. Since $C(E, F) \cap B(E, F)$ is closed in $B(E, F)$, this limit
belongs to $C(E, F) \cap B(E, F)$. Hence, $C(E, F) \cap B(E, F)$ is complete. ∎

Corollary 4.22 If $E$ is compact and $F$ is complete, then $C(E, F)$ is also complete.
Hence, if $E$ is compact and $F$ is a Banach space, then $C(E, F)$ is also a Banach
space.

Proof Since $E$ is compact, every $f \in C(E, F)$ is bounded, hence $C(E, F)$ coincides
with $C(E, F) \cap B(E, F)$. The conclusion follows by the previous corollary. ∎




Exponential and Circular Functions 


The aim of this chapter is to provide a unified construction of the exponential and 
the trigonometric functions using geometrical arguments in the complex plane. The 
basis of this construction will lie in the proof of the following statement. 


Theorem 5.1 Let $\zeta$ be a nonzero complex number, with $\Re\zeta \ge 0$ and $\Im\zeta \ge 0$, and
let $\tau$ be a positive real number. There exists a unique continuous function
$f : \mathbb{R} \to \mathbb{C} \setminus \{0\}$ with the following properties:

(a) $f(0) = 1$, $f(\tau) = \zeta$.
(b) $f(x_1 + x_2) = f(x_1) f(x_2)$, for every $x_1, x_2 \in \mathbb{R}$.
(c) $\Re f(x) \ge 0$ and $\Im f(x) \ge 0$, for every $x \in [0, \tau]$.

Its proof will be developed in the next section. We will first learn how to measure
the length of an arc of the unit circle. This will also lead to the definition of the
number $\pi$. Then the function $f$ will be defined on some dense subset of $\mathbb{R}$ in the
most natural way, indeed the only possible one. Finally, it will be extended to the
whole real line.

So let us carry out this plan.


5.1 The Construction

It is not restrictive to assume $\tau = 1$. Indeed, if we denote by $f_\tau : \mathbb{R} \to \mathbb{C} \setminus \{0\}$ the
function we are looking for, once we have found $f_1 : \mathbb{R} \to \mathbb{C} \setminus \{0\}$, it is sufficient
to define

\[
f_\tau(x) = f_1\Big( \frac{x}{\tau} \Big),
\]

so that all the requirements are satisfied.


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_5




We will denote the first quadrant of the complex plane by

\[
Q_1 = \{ z \in \mathbb{C} : \Re(z) \ge 0, \ \Im(z) \ge 0 \}.
\]

The proof of Theorem 5.1 is divided into the following three subsections, as 
explained earlier. 
5.1.1 Preliminaries for the Proof 
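We provide here a rigorous definition of the argument of the complex number $\zeta$,
passing through some sequences that have a simple geometric interpretation in the
complex plane.

Let $(\sigma_n)_n$ be the sequence of complex numbers

\[
\sigma_n = x_n + i y_n \in Q_1,
\]

such that

\[
\sigma_0 = \zeta, \quad \text{and} \quad \sigma_{n+1}^2 = \sigma_n, \quad \text{for every } n \in \mathbb{N}.
\]

More explicitly, $\sigma_{n+1} = (x_{n+1}, y_{n+1})$, with

\[
x_{n+1} = \sqrt{\frac{x_n + \sqrt{x_n^2 + y_n^2}}{2}}, \qquad
y_{n+1} = \frac{y_n}{\sqrt{2 \big( x_n + \sqrt{x_n^2 + y_n^2} \big)}}.
\]

Notice that $x_n > 0$ for every $n \ge 1$. It is easily seen, by induction, that

\[
\sigma_n^{2^n} = \zeta, \quad \text{for every } n \in \mathbb{N}. \tag{5.1}
\]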


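Now let $(\tilde\sigma_n)_n$ be defined as

\[
\tilde\sigma_n = \frac{\sigma_n}{|\sigma_n|}.
\]

These are unit vectors, i.e., $|\tilde\sigma_n| = 1$, and, setting

\[
\tilde\zeta = \frac{\zeta}{|\zeta|},
\]

it is easily seen that

\[
\tilde\sigma_0 = \tilde\zeta, \quad \text{and} \quad \tilde\sigma_{n+1}^2 = \tilde\sigma_n, \quad \text{for every } n \in \mathbb{N}.
\]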




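The two sequences $(\sigma_n)_n$ and $(\tilde\sigma_n)_n$ are drawn in Fig. 5.1 in the case $\Im(\zeta) \ne 0$ (the
following pictures will also be drawn for this case). On the other hand, if $\Im(\zeta) = 0$,
then $\tilde\sigma_n = 1$ for every $n \in \mathbb{N}$.

Let us now define the two sequences $(\ell_n)_n$ and $(L_n)_n$ as

\[
\ell_n = |\tilde\sigma_n - 1|, \qquad L_n = \frac{2\ell_n}{\sqrt{4 - \ell_n^2}}.
\]

They represent the lengths of the segments depicted in Fig. 5.2.
Observe now that, for every $m$,

\[
|\tilde\sigma_n^{m+1} - \tilde\sigma_n^m| = |\tilde\sigma_n^m (\tilde\sigma_n - 1)| = |\tilde\sigma_n|^m \, |\tilde\sigma_n - 1| = |\tilde\sigma_n - 1| = \ell_n,
\]

so that the points

\[
1, \ \tilde\sigma_n, \ \tilde\sigma_n^2, \ \tilde\sigma_n^3, \ \tilde\sigma_n^4, \ \dots
\]

all lie on the unit circle $S^1 = \{ z \in \mathbb{C} : |z| = 1 \}$, and $\ell_n$ is the distance from each
one of them to the next one. This fact can be visualized in Fig. 5.3.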


Fig. 5.1 The definition of $\sigma_n$ and $\tilde\sigma_n$

Fig. 5.2 The definition of $\ell_n$ and $L_n$

Fig. 5.3 The equidistant points $\tilde\sigma_n^m$

Fig. 5.4 A geometric view of $\ell_{n+1}$ in terms of $\ell_n$

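Let us prove that (Fig. 5.4)

\[
\ell_{n+1} = \sqrt{2 - \sqrt{4 - \ell_n^2}}. \tag{5.2}
\]

Indeed, since $1 = |\tilde\sigma_n|^2 = \tilde\sigma_n \tilde\sigma_n^*$, for every $n \in \mathbb{N}$, we have

\[
\ell_n^2 = (\tilde\sigma_n - 1)(\tilde\sigma_n - 1)^* = (\tilde\sigma_n - 1)(\tilde\sigma_n^* - 1)
= (\tilde\sigma_n - 1)\Big( \frac{1}{\tilde\sigma_n} - 1 \Big) = 2 - \tilde\sigma_n - \frac{1}{\tilde\sigma_n},
\]

hence

\[
(2 - \ell_{n+1}^2)^2 = \Big( \tilde\sigma_{n+1} + \frac{1}{\tilde\sigma_{n+1}} \Big)^2 = 2 + \tilde\sigma_n + \frac{1}{\tilde\sigma_n} = 4 - \ell_n^2,
\]

and formula (5.2) directly follows.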




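It can now be noted that, since $\tilde\sigma_0$ belongs to $S^1 \cap Q_1$, it must be that
$\ell_0 = |\tilde\sigma_0 - 1| \le \sqrt{2}$ and, by formula (5.2),

\[
0 \le \ell_n \le \sqrt{2}, \quad \text{for every } n \in \mathbb{N}. \tag{5.3}
\]

We finally define the two sequences $(a_n)_n$ and $(b_n)_n$ as

\[
a_n = 2^n \ell_n, \qquad b_n = 2^n L_n.
\]

Note that if $\Im(\zeta) = 0$, then $\ell_n = 0$, hence $a_n = b_n = 0$ for every $n \in \mathbb{N}$.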
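Let us now concentrate on the case $\Im(\zeta) > 0$. In this case, $\ell_n > 0$ and $\sqrt{4 - \ell_n^2} < 2$, so that

\[
0 < \ell_n < L_n,
\]

hence $a_n < b_n$, for every $n \in \mathbb{N}$.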
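Let us see that the sequence $(a_n)_n$ is strictly increasing; by (5.2),

\[
\frac{a_{n+1}}{a_n} = \frac{2\ell_{n+1}}{\ell_n} = \frac{2\sqrt{2 - \sqrt{4 - \ell_n^2}}}{\ell_n} = \frac{2}{\sqrt{2 + \sqrt{4 - \ell_n^2}}} > 1,
\]

hence $a_n < a_{n+1}$, for every $n \in \mathbb{N}$.
On the other hand, let us prove that the sequence $(b_n)_n$ is strictly decreasing;
by (5.2) again,

\[
\frac{b_n}{b_{n+1}} = \frac{L_n}{2L_{n+1}} = \frac{2 + \sqrt{4 - \ell_n^2}}{2\sqrt{4 - \ell_n^2}}
= \frac{1}{2}\Big( 1 + \frac{2}{\sqrt{4 - \ell_n^2}} \Big) > \frac{1}{2}(1 + 1) = 1,
\]

hence $b_n > b_{n+1}$ for every $n \in \mathbb{N}$.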




Thus, the sequences $(a_n)_n$ and $(b_n)_n$ are monotone and, since $a_0 \le a_n < b_n \le b_0$,
also bounded, so they both have a finite limit. Since, then,

\[
\lim_n \ell_n = \lim_n \frac{a_n}{2^n} = 0,
\]

we have

\[
\frac{\lim_n b_n}{\lim_n a_n} = \lim_n \frac{b_n}{a_n} = \lim_n \frac{2}{\sqrt{4 - \ell_n^2}} = 1,
\]

so we can conclude that the two sequences do indeed have the same limit. We call
this real number the argument of $\zeta$ and denote it by $\mathrm{Arg}(\zeta)$. We can thus write

\[
\mathrm{Arg}(\zeta) = \lim_n 2^n |\tilde\sigma_n - 1|.
\]

In this way, we have rigorously defined the “length” of the arc on the unit circle
$S^1$ starting from $(1, 0)$ and arriving at $\tilde\zeta = \zeta/|\zeta|$, moving in the counterclockwise
direction.

It may surprise the reader that such an intuitive notion has required so much 
work! However, the precise definition of the length of a curve will only be given 
later on in this book and requires some deeper analytical tools (Chap. 11). 

We are now ready to introduce an important number in mathematics, the number
$\pi$, pronounced “pie,” defined as

\[
\pi = 2\,\mathrm{Arg}(i) = 3.14159\ldots
\]

The importance of this number $\pi$ will emerge later on. It measures twice the length
of the arc on the unit circle $S^1$ starting from $(1, 0)$ and arriving at $(0, 1)$, moving
in a counterclockwise direction, so half the length of $S^1$ itself. It can be proved that
it is an irrational number.
In the case where $\Im(\zeta) = 0$, i.e., when $\zeta$ is a positive real number, we set

\[
\mathrm{Arg}(\zeta) = 0.
\]

In what follows, we will require the inequality

\[
|\tilde\sigma_n - 1| \le \frac{1}{2^n} \mathrm{Arg}(\zeta), \quad \text{for every } n \in \mathbb{N}, \tag{5.4}
\]

which is a direct consequence of the fact that $(a_n)_n$ is increasing.
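
The construction of this subsection lends itself to a quick numerical experiment. The sketch below is an illustration only, not part of the proof: the function name `arg_bounds` is ours, floating-point arithmetic replaces exact computation, and we assume the formulas $a_n = 2^n \ell_n$, $b_n = 2^n L_n$ with $\ell_n = |\tilde\sigma_n - 1|$ and $L_n = 2\ell_n / \sqrt{4 - \ell_n^2}$. The iterated square roots are taken with `cmath.sqrt`, whose principal value stays in the first quadrant when its argument does.

```python
import cmath
import math


def arg_bounds(zeta: complex, n: int) -> tuple[float, float]:
    """Return (a_n, b_n) = (2^n * l_n, 2^n * L_n) for zeta in the first quadrant."""
    sigma = complex(zeta)
    for _ in range(n):
        # Principal square root: remains in the first quadrant when sigma does.
        sigma = cmath.sqrt(sigma)
    unit = sigma / abs(sigma)                      # sigma_n rescaled to modulus 1
    chord = abs(unit - 1)                          # l_n
    tangent = 2 * chord / math.sqrt(4 - chord**2)  # L_n
    return 2**n * chord, 2**n * tangent


for n in (1, 5, 10, 15):
    a_n, b_n = arg_bounds(1j, n)
    # 2 * a_n and 2 * b_n bracket pi = 2 Arg(i) from below and from above.
    print(n, 2 * a_n, 2 * b_n)
```

For large $n$ the difference $\tilde\sigma_n - 1$ becomes tiny and rounding errors eventually dominate, so moderate values of $n$ are preferable in practice.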




5.1.2 Definition on a Dense Set 


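We first define the function $f$ on the set

\[
E = \Big\{ \frac{m}{2^n} : m \in \mathbb{Z}, \ n \in \mathbb{N} \Big\},
\]

which is a dense subset of $\mathbb{R}$. We will see that if we want a function $f : E \to \mathbb{C} \setminus \{0\}$
to satisfy the conditions (a), (b), and (c) of the statement, then its definition is
uniquely determined.

Thus, assume that (a), (b), and (c) hold for some function $f : E \to \mathbb{C} \setminus \{0\}$.
Since $f(1) = \zeta = \sigma_0$, by (b),

\[
\sigma_0 = f(1) = f\Big( \frac{1}{2} + \frac{1}{2} \Big) = f\Big( \frac{1}{2} \Big) f\Big( \frac{1}{2} \Big) = \Big( f\Big( \frac{1}{2} \Big) \Big)^2,
\]

and since $\Re f(\frac{1}{2}) \ge 0$, $\Im f(\frac{1}{2}) \ge 0$, it must be that $f(\frac{1}{2}) = \sigma_1$. Similarly, since

\[
\sigma_1 = f\Big( \frac{1}{2} \Big) = f\Big( \frac{1}{4} + \frac{1}{4} \Big) = \Big( f\Big( \frac{1}{4} \Big) \Big)^2,
\]

we see that $f(\frac{1}{4}) = \sigma_2$. Iterating this process, we see that we must set

\[
f\Big( \frac{1}{2^n} \Big) = \sigma_n, \quad \text{for every } n \in \mathbb{N}.
\]

Moreover,

\[
f\Big( \frac{m}{2^n} \Big) = f\Big( m \, \frac{1}{2^n} \Big) = \Big[ f\Big( \frac{1}{2^n} \Big) \Big]^m.
\]

This shows that if (a), (b), and (c) hold, then the definition of $f$ on the set $E$ must
be

\[
f\Big( \frac{m}{2^n} \Big) = \sigma_n^m. \tag{5.5}
\]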

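This definition does not depend on the chosen representation of the point $x = m/2^n$ of $E$.
Indeed, if, for instance, $n' > n$, then we see by (5.1) that $\sigma_n = \sigma_{n'}^{2^{n'-n}}$, hence

\[
\sigma_n^m = \big( \sigma_{n'}^{2^{n'-n}} \big)^m = \sigma_{n'}^{m 2^{n'-n}},
\qquad \text{in agreement with} \qquad \frac{m}{2^n} = \frac{m 2^{n'-n}}{2^{n'}}.
\]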


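Let us now prove that the function $f : E \to \mathbb{C} \setminus \{0\}$ defined by (5.5) satisfies the
properties (a), (b), and (c). Notice that $f(0) = 1$ and $f(1) = \zeta$, so that property
(a) holds. Let us prove that

\[
f(x_1 + x_2) = f(x_1) f(x_2), \quad \text{for every } x_1, x_2 \in E, \tag{5.6}
\]

which is property (b) on the domain $E$. Taking $x_1 = \frac{k}{2^n}$ and $x_2 = \frac{m}{2^n}$ (we can now
choose the same denominator), we have

\[
f\Big( \frac{k}{2^n} + \frac{m}{2^n} \Big) = f\Big( \frac{k+m}{2^n} \Big) = \sigma_n^{k+m} = \sigma_n^k \sigma_n^m = f\Big( \frac{k}{2^n} \Big) f\Big( \frac{m}{2^n} \Big).
\]

Finally, with the aim of verifying property (c), we claim that

\[
1, \ \tilde\sigma_n, \ \tilde\sigma_n^2, \ \dots, \ \tilde\sigma_n^{2^n} \ \text{ belong to } Q_1, \quad \text{for every } n \in \mathbb{N}.
\]

This is surely true if $n = 0$ or $1$. If $n = 2$, then we have that $1$, $\tilde\sigma_2$ and $\tilde\sigma_2^2 = \tilde\sigma_1$
surely belong to $Q_1$, as well as $\tilde\sigma_2^4 = \tilde\zeta$. Concerning $\tilde\sigma_2^3$, we notice that

\[
|\tilde\sigma_2^3| = 1 \quad \text{and} \quad |\tilde\sigma_2^3 - \tilde\sigma_2^2| = |\tilde\sigma_2^4 - \tilde\sigma_2^3| = \ell_2.
\]

In principle, two points satisfy these properties, one in the first quadrant and one in
the third. However, since $\ell_2 < \sqrt{2}$, it must be that $\tilde\sigma_2^3$ belongs to $Q_1$. By induction,
the same argument can be used to prove the claim for every $n \in \mathbb{N}$.

Thus, we have constructed a function $f : E \to \mathbb{C} \setminus \{0\}$ that verifies the properties
(a), (b), and (c) on its domain. And this is the only possible function with these
properties.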
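
The definition on dyadic points can be tried out in a few lines. The following sketch is an illustration under stated assumptions: it takes formula (5.5) in the form $f(m/2^n) = \sigma_n^m$, computes $\sigma_n$ by repeated principal square roots in floating point, and uses names (`sigma`, `f_dyadic`) that are ours rather than the book's.

```python
import cmath


def sigma(zeta: complex, n: int) -> complex:
    """sigma_n: zeta after n successive square roots taken in the first quadrant."""
    s = complex(zeta)
    for _ in range(n):
        s = cmath.sqrt(s)
    return s


def f_dyadic(zeta: complex, m: int, n: int) -> complex:
    """Value prescribed by (5.5) at the dyadic point m / 2**n."""
    return sigma(zeta, n) ** m


zeta = 1 + 2j
# Independence of the representation: 3/4 and 12/16 are the same point of E.
print(f_dyadic(zeta, 3, 2), f_dyadic(zeta, 12, 4))
# Property (b): f(3/8) * f(5/8) should reproduce f(1) = zeta up to rounding.
print(f_dyadic(zeta, 3, 3) * f_dyadic(zeta, 5, 3))
```

Equalities hold only up to rounding errors here, whereas in the text they are exact identities.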


5.1.3. Extension to the Whole Real Line 


To extend the function $f : E \to \mathbb{C} \setminus \{0\}$ defined by (5.5) to the whole real line $\mathbb{R}$,
we will apply Theorem 4.16. To this end, we first need to verify that $f$ is uniformly
continuous on any bounded subset of its domain $E$. We fix a real number $R > 0$ and
consider the restriction of $f$ to $E \cap [-R, R]$.

We define the two functions $g : E \to \,]0, +\infty[$ and $h : E \to S^1$ by

\[
g(x) = |f(x)|, \qquad h(x) = \frac{f(x)}{|f(x)|},
\]




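and we remark that

\[
g(x_1 + x_2) = g(x_1) g(x_2) \quad \text{for every } x_1, x_2 \in E \tag{5.7}
\]

and

\[
h(x_1 + x_2) = h(x_1) h(x_2) \quad \text{for every } x_1, x_2 \in E. \tag{5.8}
\]

Let us first concentrate on the function $g$ and prove that it is uniformly continuous
on $E \cap [-R, R]$. We need some preliminary considerations.

It is easily seen that if $|\zeta| = 1$, then $g$ is constant. Assume now that $|\zeta| > 1$. In
this case, it can be seen that $|\sigma_n| > 1$ for every $n \in \mathbb{N}$, so also $|\sigma_n^m| > 1$ for every
$n \in \mathbb{N}$ and $m \ge 1$, i.e.,

\[
g(x) > 1, \quad \text{for every } x \in E \, \cap \, ]0, +\infty[.
\]

Consequently,

\[
x_1 < x_2 \;\Rightarrow\; g(x_2) = g(x_1)\, g(x_2 - x_1) > g(x_1),
\]

proving that $g$ is strictly increasing. Let us now show that

\[
\lim_{x \to 0^+} g(x) = 1.
\]

Fix an $\varepsilon > 0$. Let $\bar n \in \mathbb{N}$ be such that $\bar n \ge (|\zeta| - 1)/\varepsilon$. Then, for every $n \ge \bar n$, using
the Bernoulli inequality,

\[
|\zeta| \le 1 + \bar n \varepsilon \le 1 + 2^n \varepsilon \le (1 + \varepsilon)^{2^n},
\]

so that, since $|\sigma_n|^{2^n} = |\zeta|$,

\[
1 < g\Big( \frac{1}{2^n} \Big) = |\sigma_n| \le 1 + \varepsilon.
\]

Since $g$ is increasing, this proves the claim.
We are now ready to prove that $g$ is uniformly continuous on $E \cap [-R, R]$. Let
us fix $\varepsilon > 0$. By the above considerations, there exists $\delta > 0$ such that

\[
0 < x < \delta \;\Rightarrow\; 1 < g(x) < 1 + \frac{\varepsilon}{g(R)}.
\]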

Then, taking $x_1, x_2 \in E \cap [-R, R]$ such that $x_1 < x_2$ and $x_2 - x_1 < \delta$, we have

\[
0 < g(x_2) - g(x_1) = g(x_1)\big( g(x_2 - x_1) - 1 \big) < g(R)\, \frac{\varepsilon}{g(R)} = \varepsilon.
\]

This proves that $g$ is uniformly continuous on $E \cap [-R, R]$, with values in the
interval $[g(-R), g(R)]$.

If $|\zeta| < 1$, then the proof is similar (but $g$ is strictly decreasing in this case).

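We now concentrate on the function $h$. Notice that

\[
h\Big( \frac{m}{2^n} \Big) = \tilde\sigma_n^m.
\]

Take $x_1 = \frac{k}{2^n}$ and $x_2 = \frac{m}{2^n}$, with $k < m$. Then

\[
h(x_2) - h(x_1) = \tilde\sigma_n^m - \tilde\sigma_n^k = \tilde\sigma_n^k (\tilde\sigma_n - 1)(1 + \tilde\sigma_n + \tilde\sigma_n^2 + \dots + \tilde\sigma_n^{m-k-1}),
\]

and hence, by (5.4),

\[
\Big| h\Big( \frac{m}{2^n} \Big) - h\Big( \frac{k}{2^n} \Big) \Big| \le |\tilde\sigma_n - 1| (m - k) \le \mathrm{Arg}(\zeta) \Big( \frac{m}{2^n} - \frac{k}{2^n} \Big).
\]

This proves that $h$ is uniformly continuous on the whole domain $E$.
The restriction of the function $f$ to $E \cap [-R, R]$ is uniformly continuous, since,
when $x_1, x_2 \in E \cap [-R, R]$,

\[
\begin{aligned}
|f(x_2) - f(x_1)| &= |g(x_2) h(x_2) - g(x_1) h(x_1)| \\
&\le |g(x_2) - g(x_1)| \, |h(x_2)| + |h(x_2) - h(x_1)| \, |g(x_1)| \\
&\le |g(x_2) - g(x_1)| + |h(x_2) - h(x_1)| \max\{ g(-R), g(R) \},
\end{aligned}
\]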

and both $g$ and $h$ are uniformly continuous. This restriction of the function $f$ takes
its values in the compact set

\[
F = \{ z \in \mathbb{C} : r_1 \le |z| \le r_2 \},
\]

where $r_1 = g(-R)$ and $r_2 = g(R)$, or vice versa. Since $F$ is a closed subset of
the complete metric space $\mathbb{C}$, it is a complete metric space itself (with the Euclidean
distance). Hence, by Theorem 4.16, $f$ can be extended in a unique way to a
continuous function on $[-R, R]$, with values in the same set $F$. Since this can be
done for an arbitrary $R > 0$, we have thus defined a continuous extension of $f$ to
the whole real axis $\mathbb{R}$, with nonzero values. And this is the only possible continuous
extension. We will still denote by $f$ this new function. We now verify properties
(a), (b), and (c) for this function.

Recalling that $\tau = 1$, property (a) was already verified earlier since $f(0) = 1$
and $f(1) = \zeta$.

Concerning property (b), take $x_1, x_2$ in $\mathbb{R}$, and let $(x_{1,n})_n$ and $(x_{2,n})_n$ be two
sequences in $E$ such that $\lim_n x_{1,n} = x_1$ and $\lim_n x_{2,n} = x_2$. Then, from (5.6), by
continuity,

\[
\begin{aligned}
f(x_1 + x_2) &= f\big( \lim_n (x_{1,n} + x_{2,n}) \big) = \lim_n f(x_{1,n} + x_{2,n}) = \lim_n f(x_{1,n}) f(x_{2,n}) \\
&= \lim_n f(x_{1,n}) \, \lim_n f(x_{2,n}) = f(x_1) f(x_2).
\end{aligned}
\]

Finally, to verify property (c), take $x \in [0, 1]$, and let $(x_n)_n$ be a sequence in
$E \cap [0, 1]$ such that $\lim_n x_n = x$. Since $f(x_n)$ belongs to $Q_1$, which is a closed set,
then, by continuity,

\[
f(x) = \lim_n f(x_n) \in Q_1.
\]

We have thus constructed a continuous function $f : \mathbb{R} \to \mathbb{C} \setminus \{0\}$ that verifies
the properties (a), (b), and (c). And this is the only possible one. Then the proof of
Theorem 5.1 is complete.
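
The extension step can also be imitated numerically: a real number $x$ is replaced by a nearby dyadic point $m/2^n$, where the function is known. The sketch below is a rough illustration, not the book's procedure; the name `f_approx` and the default $n = 20$ are our choices, $\tau = 1$ is assumed, and Python's built-in power `zeta ** x` is used only as an independent reference value.

```python
import cmath
import math


def f_approx(zeta: complex, x: float, n: int = 20) -> complex:
    """Approximate f(x) by f(m / 2**n), with m / 2**n the dyadic point just below x."""
    s = complex(zeta)
    for _ in range(n):
        s = cmath.sqrt(s)          # after the loop, s approximates sigma_n
    m = math.floor(x * 2**n)       # m / 2**n <= x < (m + 1) / 2**n
    return s**m


# zeta = 2 gives the exponential with base 2; zeta = i gives a point on the unit circle.
print(f_approx(2, 0.5), 2**0.5)
print(f_approx(1j, 1 / 3), 1j ** (1 / 3))
```

Taking $n$ much larger does not help in floating point: $\sigma_n$ gets so close to 1 that its rounding error is amplified by the large exponent $m$.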


5.2 Exponential and Circular Functions


In this section we define the exponential and circular functions, providing a proof 
for the previously stated Theorems 2.15 and 2.16. 


Proof of Theorem 2.15 If $\tau = 1$ and $\zeta$ is a positive real number, say, $\zeta = a > 0$,
then the function $h$ is constantly equal to 1 and $f$ coincides with $g : \mathbb{R} \to \,]0, +\infty[$.
This is the exponential with base $a$, i.e., the function $f_a : \mathbb{R} \to \mathbb{R}_+$ whose existence
was stated in Theorem 2.15. Indeed, property (i) follows from (5.7), and $g(1) = a$.
We need to prove that if $a \ne 1$, then $g : \mathbb{R} \to \,]0, +\infty[$ is invertible.

As seen previously, when $a > 1$, the function $g$ is strictly increasing on $E$, and
so also on $\mathbb{R}$, whereas if $a < 1$, then it is strictly decreasing. Hence, if $a \ne 1$, then
the function $g$ is injective; let us show that it is also surjective.

We have shown, after the statement of Theorem 2.15, that $g(n) = a^n$ for every
$n \in \mathbb{N}$. Assume, for instance, $a > 1$; then $\lim_n a^n = +\infty$, hence, by monotonicity,
also

\[
\lim_{x \to +\infty} g(x) = +\infty.
\]

On the other hand, since $g(x) g(-x) = g(x - x) = g(0) = 1$,

\[
\lim_{x \to -\infty} g(x) = \lim_{x \to +\infty} g(-x) = \lim_{x \to +\infty} \frac{1}{g(x)} = 0.
\]

We can then conclude, by Bolzano’s Theorem 2.12, that the image of $g$ is the whole
interval $]0, +\infty[$.

We have thus proved that if $a > 1$, then $g : \mathbb{R} \to \,]0, +\infty[$ is invertible. The
same conclusion holds when $0 < a < 1$, and the proof is analogous. The proof of
Theorem 2.15 is thus complete. ∎

Proof of Theorem 2.16 Now let $\zeta = i$ and $\tau > 0$ be arbitrary. Then the function
$g$ is constantly equal to 1, so $f$ coincides with $h : \mathbb{R} \to S^1$. Notice that, since
$h(\tau) = i$,

\[
\begin{aligned}
h(2\tau) &= h(\tau + \tau) = h(\tau)^2 = i^2 = -1, \\
h(3\tau) &= h(2\tau + \tau) = h(2\tau) h(\tau) = -i, \\
h(4\tau) &= h(3\tau + \tau) = h(3\tau) h(\tau) = 1,
\end{aligned}
\]

and then

\[
h(x + 4\tau) = h(x) h(4\tau) = h(x), \quad \text{for every } x \in \mathbb{R},
\]

showing that $h$ is a periodic function, with period $T = 4\tau$. We would like to prove
that $T$ is indeed the minimal period of $h$.

Since $h$ is continuous and nonconstant, its minimal period is $T/k$ for some
integer $k \ge 1$. Assume by contradiction that $k \ge 2$. Then

\[
0 < \frac{T}{2k} \le \tau
\]

and

\[
1 = h\Big( \frac{T}{k} \Big) = \Big[ h\Big( \frac{T}{2k} \Big) \Big]^2,
\]

and since $h(T/2k) \in Q_1$, it must be that $h(T/2k) = 1$. Then we will have

\[
1 = h\Big( \frac{T}{2k} \Big) = \Big[ h\Big( \frac{T}{4k} \Big) \Big]^2,
\]

and since $h(T/4k) \in Q_1$, it must be that $h(T/4k) = 1$, too. Proceeding in this way,
we see that it must be that

\[
h\Big( \frac{T}{2^j k} \Big) = 1, \quad \text{for every } j \in \mathbb{N}.
\]

Hence, also

\[
h\Big( \frac{mT}{2^j k} \Big) = 1, \quad \text{for every } j \in \mathbb{N} \text{ and } m \in \mathbb{Z}.
\]

Since the set $\{ mT/(2^j k) : j \in \mathbb{N}, m \in \mathbb{Z} \}$ is dense in $\mathbb{R}$ and $h$ is continuous, this
would imply that $h$ is constantly equal to 1, a contradiction, since $h(\tau) = i$.
We have thus proved that

\[
T = 4\tau \text{ is the minimal period of } h,
\]

and henceforth we will write $h_T$ instead of $h$. Since (i) follows from (5.8) and
$h(\tau) = i$, the proof of Theorem 2.16 is thus now complete. ∎

5.3 Limits for Trigonometric Functions


In the following theorem, the number $\pi$ enters the picture.

Theorem 5.2 We have

\[
\lim_{x \to 0^+} \frac{h_T(x) - 1}{x} = \frac{2\pi}{T}\, i.
\]

Proof Since $T = 4\tau$ and

\[
h_{4\tau}(x) = h_4\Big( \frac{x}{\tau} \Big),
\]

it will be equivalent to proving that

\[
\lim_{x \to 0^+} \frac{h_4(x) - 1}{x} = \frac{\pi}{2}\, i.
\]

First of all, we show that this is true when $x = \frac{1}{2^n}$, i.e., that

\[
\lim_n 2^n (\sigma_n - 1) = \frac{\pi}{2}\, i. \tag{5.9}
\]

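(Recall that $\sigma_n$ and $\tilde\sigma_n$ coincide in this case.) We already know that

\[
\lim_n |2^n (\sigma_n - 1)| = \mathrm{Arg}(i) = \frac{\pi}{2} = \Big| \frac{\pi}{2}\, i \Big|;
\]

hence, since $\Im(2^n(\sigma_n - 1)) > 0$, it will be sufficient to show that

\[
\lim_n 2^n \Re(\sigma_n - 1) = 0. \tag{5.10}
\]

Since $\sigma_n^* = \sigma_n^{-1}$, we see that

\[
\Re(\sigma_n - 1) = \frac{\sigma_n + \sigma_n^*}{2} - 1 = \frac{\sigma_n + \sigma_n^{-1} - 2}{2} = \frac{\sigma_n^2 + 1 - 2\sigma_n}{2\sigma_n} = \frac{(\sigma_n - 1)^2}{2\sigma_n}.
\]

Recalling (5.4) with $\zeta = i$, since $\tilde\sigma_n = \sigma_n$ and $\mathrm{Arg}(i) = \frac{\pi}{2}$, we have that

\[
|\sigma_n - 1| \le \frac{\pi}{2^{n+1}}. \tag{5.11}
\]

Hence,

\[
|2^n \Re(\sigma_n - 1)| \le 2^n \frac{|\sigma_n - 1|^2}{2|\sigma_n|} \le \frac{\pi^2}{2^{n+3}},
\]

thereby proving (5.10) and, hence, (5.9).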
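We now prove the stated limit when $x$ varies in $E$, i.e., when $x = \frac{m}{2^n} > 0$. In
such a case,

\[
\begin{aligned}
\Big| \frac{h_4(x) - 1}{x} - \frac{\pi}{2}\, i \Big|
&= \Big| \frac{\sigma_n^m - 1}{m/2^n} - \frac{\pi}{2}\, i \Big| \\
&= \Big| 2^n (\sigma_n - 1)\, \frac{\sigma_n^{m-1} + \sigma_n^{m-2} + \dots + \sigma_n + 1}{m} - \frac{\pi}{2}\, i \Big| \\
&\le |2^n (\sigma_n - 1)| \, \Big| \frac{1 + \sigma_n + \sigma_n^2 + \dots + \sigma_n^{m-1}}{m} - 1 \Big| + \Big| 2^n (\sigma_n - 1) - \frac{\pi}{2}\, i \Big| \\
&\le 2^n |\sigma_n - 1| \, \frac{|\sigma_n - 1| + |\sigma_n^2 - 1| + \dots + |\sigma_n^{m-1} - 1|}{m} + \Big| 2^n (\sigma_n - 1) - \frac{\pi}{2}\, i \Big|.
\end{aligned}
\]

By (5.11), for $k = 1, 2, \dots, m - 1$ we have

\[
|\sigma_n^k - 1| = \Big| \sum_{j=1}^{k} (\sigma_n^j - \sigma_n^{j-1}) \Big| \le \sum_{j=1}^{k} |\sigma_n^j - \sigma_n^{j-1}| = k |\sigma_n - 1| \le k\, \frac{\pi}{2^{n+1}}.
\]

Using the formula

\[
1 + 2 + 3 + \dots + (m - 1) = \frac{(m-1)m}{2},
\]

we obtain

\[
\frac{1}{m} \big( |\sigma_n - 1| + |\sigma_n^2 - 1| + \dots + |\sigma_n^{m-1} - 1| \big)
\le \frac{1}{m} \frac{\pi}{2^{n+1}} \big( 1 + 2 + \dots + (m-1) \big)
= \frac{1}{m} \frac{(m-1)m}{2} \frac{\pi}{2^{n+1}} < \frac{\pi}{4} \frac{m}{2^n}.
\]

In conclusion, since $2^n |\sigma_n - 1| \le \frac{\pi}{2}$ by (5.11), if $x = \frac{m}{2^n} > 0$, then

\[
\Big| \frac{h_4(x) - 1}{x} - \frac{\pi}{2}\, i \Big| \le \frac{\pi^2}{8} \frac{m}{2^n} + \Big| 2^n (\sigma_n - 1) - \frac{\pi}{2}\, i \Big|.
\]

As $x = \frac{m}{2^n}$ tends to 0, necessarily $n$ tends to $+\infty$, and the result follows by (5.9).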

We finally look for the limit as $x \to 0^+$, without further restrictions on $x$, and
assume by contradiction that either such a limit does not exist or that it is not equal
to $\frac{\pi}{2} i$. Then there is $\varepsilon > 0$ and a strictly decreasing sequence $(x_n)_n$, with $x_n \to 0^+$,
such that, for every $n$,

\[
\Big| \frac{h_4(x_n) - 1}{x_n} - \frac{\pi}{2}\, i \Big| > \varepsilon.
\]

By the continuity of the function $h_4$ and the density of $E$ in $\mathbb{R}$, for every
sufficiently large $n$ one can find a positive number $x_n' \in E$ such that

\[
|x_n - x_n'| < \frac{1}{n} \quad \text{and} \quad \Big| \frac{h_4(x_n') - 1}{x_n'} - \frac{\pi}{2}\, i \Big| > \varepsilon,
\]

contradicting the previous part of the proof. ∎


As a consequence of the preceding theorem, we have the following corollary.

Corollary 5.3 We have

\[
\lim_{x \to 0} \frac{\sin_T(x)}{x} = \frac{2\pi}{T}.
\]

Proof Writing $h_T(x) = \cos_T(x) + i \sin_T(x)$, we have

\[
\frac{h_T(x) - 1}{x} = \frac{\cos_T(x) - 1}{x} + i\, \frac{\sin_T(x)}{x}.
\]

Hence, by Theorem 5.2,

\[
\lim_{x \to 0^+} \frac{\cos_T(x) - 1}{x} = 0, \qquad \lim_{x \to 0^+} \frac{\sin_T(x)}{x} = \frac{2\pi}{T}.
\]

Now, we have shown, after the statement of Theorem 2.16, that $\sin_T$ is an odd
function; hence,

\[
\lim_{x \to 0^-} \frac{\sin_T(x)}{x} = \lim_{x \to 0^+} \frac{\sin_T(-x)}{-x} = \lim_{x \to 0^+} \frac{\sin_T(x)}{x} = \frac{2\pi}{T},
\]

and the proof is completed. ∎


Notice that the choice $T = 2\pi$ simplifies the preceding formula. This is why
we will always choose as the base of the trigonometric functions the number $T = 2\pi$.
We will write $\cos(x)$, $\sin(x)$, $\tan(x)$ (or simply $\cos x$, $\sin x$, $\tan x$) instead of
$\cos_{2\pi}(x)$, $\sin_{2\pi}(x)$, $\tan_{2\pi}(x)$. Hence,

\[
\lim_{x \to 0} \frac{\sin x}{x} = 1.
\]

The knowledge of this limit now allows us to prove that

\[
\lim_{x \to 0} \frac{\tan x}{x} = 1.
\]

Indeed, we have

\[
\lim_{x \to 0} \frac{\tan x}{x} = \lim_{x \to 0} \frac{\sin x}{x} \frac{1}{\cos x} = \lim_{x \to 0} \frac{\sin x}{x} \cdot \lim_{x \to 0} \frac{1}{\cos x} = 1 \cdot \frac{1}{\cos(0)} = 1.
\]

Moreover, we can also prove that

\[
\lim_{x \to 0} \frac{\cos x - 1}{x^2} = -\frac{1}{2}.
\]

Indeed, we have

\[
\lim_{x \to 0} \frac{\cos x - 1}{x^2} = \lim_{x \to 0} \frac{\cos^2 x - 1}{x^2 (\cos x + 1)}
= - \lim_{x \to 0} \Big( \frac{\sin x}{x} \Big)^2 \cdot \lim_{x \to 0} \frac{1}{\cos x + 1} = -1 \cdot \frac{1}{2} = -\frac{1}{2}.
\]

It will be useful to keep in mind these remarkable limits for further applications.
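
These three limits are easy to observe numerically. The short Python sketch below is only an illustration (the helper name `quotients` is ours); it evaluates the three quotients, with $T = 2\pi$, for a few small values of $h$ using the standard `math` module.

```python
import math


def quotients(h: float) -> tuple[float, float, float]:
    """Return sin(h)/h, tan(h)/h and (cos(h) - 1)/h**2 for a nonzero h."""
    return math.sin(h) / h, math.tan(h) / h, (math.cos(h) - 1) / h**2


for h in (1e-1, 1e-2, 1e-3):
    # The three values approach 1, 1 and -1/2 as h decreases.
    print(h, quotients(h))
```

For much smaller $h$ the subtraction $\cos(h) - 1$ loses accuracy in floating point, so the third quotient should not be evaluated this way for, say, $h = 10^{-9}$.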


Part II


Differential and Integral Calculus in R 




We start by introducing the concept of “derivative” of a function defined on a subset 
of R, taking its values in R. 

Let $O$, a subset of $\mathbb{R}$, be the domain of the function $f : O \to \mathbb{R}$, and consider as
fixed a point $x_0 \in O$. For every $x \in O \setminus \{x_0\}$, we may write the “difference quotient”

\[
\frac{f(x) - f(x_0)}{x - x_0};
\]

it is precisely the slope of the line passing through the points $(x_0, f(x_0))$ and
$(x, f(x))$.

Henceforth, $x_0$ will be assumed to be a cluster point of $O$.

Definition 6.1 The limit

\[
\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0},
\]

whenever it exists, is called the “derivative” of $f$ at $x_0$, and is denoted by one of the
following symbols:

\[
f'(x_0), \qquad Df(x_0), \qquad \frac{df}{dx}(x_0).
\]

We say that $f$ is “differentiable” at $x_0$ when the derivative exists and is a real number
(hence, not equal to $+\infty$ or $-\infty$). In such a case, the line passing through the point
$(x_0, f(x_0))$ and having $f'(x_0)$ as its slope, whose equation is

\[
y = f(x_0) + f'(x_0)(x - x_0),
\]

is called the “tangent line” to the graph of $f$ at the point $(x_0, f(x_0))$.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 127 


A. Fonda, A Modern Introduction to Mathematical Analysis, 
https://doi.org/10.1007/978-3-031-23713-3_6 


128 6 The Derivative 


Note that, in some cases, the derivative of f at xo could only be a left limit or 
a right limit. Typically this situation arises when O is an interval and xo coincides 
with an endpoint. 
It is sometimes useful to write, equivalently,

\[
f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}.
\]

Example 1 Let $f : \mathbb{R} \to \mathbb{R}$ be defined as $f(x) = mx + q$. Then

\[
f'(x_0) = \lim_{x \to x_0} \frac{(mx + q) - (mx_0 + q)}{x - x_0} = m.
\]

The tangent line in this case coincides with the graph of the function itself. The
particular case $m = 0$ tells us that the derivative of a constant function is always
equal to 0.


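Example 2 Let $f(x) = x^n$; then

\[
f'(x_0) = \lim_{x \to x_0} \frac{x^n - x_0^n}{x - x_0} = \lim_{x \to x_0} \Big( \sum_{k=0}^{n-1} x^k x_0^{n-1-k} \Big) = n x_0^{n-1}.
\]

Let us prove the same formula using a different approach:

\[
\begin{aligned}
f'(x_0) &= \lim_{h \to 0} \frac{(x_0 + h)^n - x_0^n}{h} = \lim_{h \to 0} \frac{1}{h} \Big( \sum_{k=0}^{n} \binom{n}{k} x_0^{n-k} h^k - x_0^n \Big) \\
&= \lim_{h \to 0} \frac{1}{h} \Big( \sum_{k=1}^{n} \binom{n}{k} x_0^{n-k} h^k \Big) = \lim_{h \to 0} \Big( \sum_{k=1}^{n} \binom{n}{k} x_0^{n-k} h^{k-1} \Big) = n x_0^{n-1}.
\end{aligned}
\]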


Example 3 Now let $f(x) = e^x$; then

\[
f'(x_0) = \lim_{h \to 0} \frac{e^{x_0 + h} - e^{x_0}}{h} = \lim_{h \to 0} e^{x_0}\, \frac{e^h - 1}{h} = e^{x_0}.
\]


Example 4 Choosing $f(x) = \cos x$, we have

\[
\begin{aligned}
f'(x_0) &= \lim_{h \to 0} \frac{\cos(x_0 + h) - \cos(x_0)}{h} \\
&= \lim_{h \to 0} \frac{\cos(x_0)\cos(h) - \sin(x_0)\sin(h) - \cos(x_0)}{h} \\
&= -\cos(x_0) \lim_{h \to 0} h\, \frac{1 - \cos(h)}{h^2} - \sin(x_0) \lim_{h \to 0} \frac{\sin(h)}{h} \\
&= -\sin(x_0).
\end{aligned}
\]
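Example 5 On the other hand, if $g(x) = \sin x$, then

\[
\begin{aligned}
g'(x_0) &= \lim_{h \to 0} \frac{\sin(x_0 + h) - \sin(x_0)}{h} \\
&= \lim_{h \to 0} \frac{\sin(x_0)\cos(h) + \cos(x_0)\sin(h) - \sin(x_0)}{h} \\
&= -\sin(x_0) \lim_{h \to 0} h\, \frac{1 - \cos(h)}{h^2} + \cos(x_0) \lim_{h \to 0} \frac{\sin(h)}{h} \\
&= \cos(x_0).
\end{aligned}
\]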
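
The derivatives found in the preceding examples can be compared with difference quotients computed for small increments. The Python sketch below is an illustration only; the helper `difference_quotient` and the sample point `x0 = 0.7` are our choices, and floating-point results agree with the exact derivatives only approximately.

```python
import math


def difference_quotient(f, x0: float, h: float) -> float:
    """Slope of the line through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h


x0 = 0.7
# Each entry: label, function, and the derivative at x0 obtained in the examples.
examples = [
    ("x**3", lambda x: x**3, 3 * x0**2),
    ("exp", math.exp, math.exp(x0)),
    ("cos", math.cos, -math.sin(x0)),
    ("sin", math.sin, math.cos(x0)),
]
for name, f, derivative in examples:
    slopes = [difference_quotient(f, x0, h) for h in (1e-2, 1e-4, 1e-6)]
    print(name, slopes, derivative)
```

The printed slopes approach the listed derivative as $h$ decreases, which is what the definition of the derivative as a limit predicts.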
The following theorem provides us with a characterization of differentiability.

Theorem 6.2 The function $f$ is differentiable at $x_0$ if and only if there exists a real
number $\ell$ for which one can write

\[
f(x) = f(x_0) + \ell (x - x_0) + r(x), \tag{6.1}
\]

where $r$ is a function such that

\[
\lim_{x \to x_0} \frac{r(x)}{x - x_0} = 0. \tag{6.2}
\]

In that case, we have $\ell = f'(x_0)$.

Proof Assume that $f$ is differentiable at $x_0$. Then

\[
\lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0)(x - x_0)}{x - x_0} = 0.
\]

Hence, setting $r(x) = f(x) - f(x_0) - f'(x_0)(x - x_0)$, the desired properties (6.1)
and (6.2) are readily verified, taking $\ell = f'(x_0)$.
Conversely, assume that (6.1) and (6.2) hold. Then

\[
\lim_{x \to x_0} \frac{f(x) - f(x_0) - \ell (x - x_0)}{x - x_0} = 0,
\]

and hence

\[
\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \Big( \ell + \frac{f(x) - f(x_0) - \ell (x - x_0)}{x - x_0} \Big) = \ell,
\]

showing that $f$ is differentiable at $x_0$. ∎
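
As a worked illustration of this characterization (our own example, not taken from the text), consider $f(x) = x^2$ at an arbitrary point $x_0$. Here the remainder $r$ can be written down explicitly:

```latex
\begin{align*}
f(x) &= x^2 = x_0^2 + 2x_0 (x - x_0) + (x - x_0)^2, \\
\ell &= 2x_0, \qquad r(x) = (x - x_0)^2, \\
\lim_{x \to x_0} \frac{r(x)}{x - x_0} &= \lim_{x \to x_0} (x - x_0) = 0 .
\end{align*}
```

So the decomposition of the theorem holds with $\ell = 2x_0$, in agreement with the derivative of $x^n$ computed earlier for $n = 2$.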


We now prove that differentiability implies continuity.

Theorem 6.3 If $f$ is differentiable at $x_0$, then $f$ is continuous at $x_0$.

Proof Since $f$ is differentiable at $x_0$, we have that

\[
\lim_{x \to x_0} f(x) = \lim_{x \to x_0} \Big[ f(x_0) + \frac{f(x) - f(x_0)}{x - x_0} (x - x_0) \Big] = f(x_0) + f'(x_0) \cdot 0 = f(x_0),
\]

which means that $f$ is continuous at $x_0$. ∎


6.1 Some Differentiation Rules 

Let us review some rules for the computation of the derivative. 

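Theorem 6.4 If $f, g : O \to \mathbb{R}$ are differentiable at $x_0$, then so is $f + g$, and

\[
(f + g)'(x_0) = f'(x_0) + g'(x_0).
\]

Proof We compute

\[
\begin{aligned}
\lim_{x \to x_0} \frac{(f + g)(x) - (f + g)(x_0)}{x - x_0}
&= \lim_{x \to x_0} \Big[ \frac{f(x) - f(x_0)}{x - x_0} + \frac{g(x) - g(x_0)}{x - x_0} \Big] \\
&= \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} + \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} \\
&= f'(x_0) + g'(x_0),
\end{aligned}
\]

and the formula is proved. ∎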


Theorem 6.5 If $f, g : O \to \mathbb{R}$ are differentiable at $x_0$, then so is $f \cdot g$, and

\[
(f \cdot g)'(x_0) = f'(x_0) g(x_0) + f(x_0) g'(x_0).
\]

Proof We can write

\[
\begin{aligned}
\lim_{x \to x_0} \frac{(f \cdot g)(x) - (f \cdot g)(x_0)}{x - x_0}
&= \lim_{x \to x_0} \frac{f(x) g(x) - f(x) g(x_0) + f(x) g(x_0) - f(x_0) g(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}\, g(x_0) + \lim_{x \to x_0} f(x)\, \frac{g(x) - g(x_0)}{x - x_0},
\end{aligned}
\]

and the conclusion follows, recalling that $\lim_{x \to x_0} f(x) = f(x_0)$, since $f$ is continuous
at $x_0$. ∎

The particular case where $g$ is constant with a value $\alpha \in \mathbb{R}$ gives us the formula

\[
(\alpha f)'(x_0) = \alpha f'(x_0).
\]

Moreover, writing $f - g = f + (-1)g$, we have that

\[
(f - g)'(x_0) = f'(x_0) - g'(x_0).
\]


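Theorem 6.6 If $f, g : O \to \mathbb{R}$ are differentiable at $x_0$ and $g(x_0) \ne 0$, then so is $\frac{f}{g}$,
and

\[
\Big( \frac{f}{g} \Big)'(x_0) = \frac{f'(x_0) g(x_0) - f(x_0) g'(x_0)}{[g(x_0)]^2}.
\]

Proof Since $\frac{f}{g} = f \cdot \frac{1}{g}$, it will be useful to first show that $\frac{1}{g}$ is differentiable at $x_0$.
Indeed, we have

\[
\lim_{x \to x_0} \frac{\frac{1}{g(x)} - \frac{1}{g(x_0)}}{x - x_0} = \lim_{x \to x_0} \frac{g(x_0) - g(x)}{(x - x_0)\, g(x) g(x_0)} = - \frac{g'(x_0)}{[g(x_0)]^2}.
\]

Then

\[
\Big( \frac{f}{g} \Big)'(x_0) = f'(x_0) \frac{1}{g(x_0)} + f(x_0) \Big( - \frac{g'(x_0)}{[g(x_0)]^2} \Big) = \frac{f'(x_0) g(x_0) - f(x_0) g'(x_0)}{[g(x_0)]^2},
\]

whence the conclusion. ∎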


Example 1 Let us take into consideration the tangent function

\[
\tan x = \frac{\sin x}{\cos x}.
\]

Choosing $f(x) = \sin x$ and $g(x) = \cos x$, we have¹

\[
D \tan x_0 = \frac{f'(x_0) g(x_0) - f(x_0) g'(x_0)}{[g(x_0)]^2} = \frac{\cos^2 x_0 + \sin^2 x_0}{\cos^2 x_0} = \frac{1}{\cos^2 x_0}.
\]

¹ Here and in what follows we will often write $\cos^2 x$ and $\sin^2 x$ instead of $(\cos x)^2$ and $(\sin x)^2$,
respectively.

Example 2 We now compute the derivative of the hyperbolic functions. Let

\[
\cosh(x) = \frac{e^x + e^{-x}}{2} = \frac{1}{2} \Big( e^x + \frac{1}{e^x} \Big).
\]

Then

\[
D \cosh(x_0) = \frac{1}{2} \Big( e^{x_0} - \frac{e^{x_0}}{(e^{x_0})^2} \Big) = \frac{e^{x_0} - e^{-x_0}}{2} = \sinh(x_0).
\]

Similarly, writing

\[
\sinh(x) = \frac{e^x - e^{-x}}{2} = \frac{1}{2} \Big( e^x - \frac{1}{e^x} \Big),
\]

we have

\[
D \sinh(x_0) = \frac{1}{2} \Big( e^{x_0} + \frac{e^{x_0}}{(e^{x_0})^2} \Big) = \frac{e^{x_0} + e^{-x_0}}{2} = \cosh(x_0).
\]

Moreover,

\[
D \tanh(x_0) = \frac{\cosh(x_0) \cosh(x_0) - \sinh(x_0) \sinh(x_0)}{\cosh^2(x_0)} = \frac{1}{\cosh^2(x_0)}.
\]

Example 3 All the polynomial functions

\[
F(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_2 x^2 + a_1 x + a_0
\]

are differentiable, with derivative

\[
F'(x_0) = n a_n x_0^{n-1} + (n-1) a_{n-1} x_0^{n-2} + \dots + 2 a_2 x_0 + a_1.
\]

Hence, all rational functions of the type

\[
F(x) = \frac{p(x)}{q(x)},
\]

with $p(x)$ and $q(x)$ polynomials, are also differentiable at all points $x_0$ where
$q(x_0) \ne 0$.


Let us now see how to compute the derivative of the composition of two 
functions. Subsequently in this chapter, we will always consider only nondegenerate 
intervals, i.e., those not reduced to a single point. 


Theorem 6.7 If $f : O \to \mathbb{R}$ is differentiable at $x_0$, and $g : J \to \mathbb{R}$ is differentiable
at $f(x_0)$, where $J$ is an interval containing $f(O)$, then $g \circ f$ is differentiable at $x_0$,
and

\[
(g \circ f)'(x_0) = g'(f(x_0)) f'(x_0).
\]

Proof Setting $y_0 = f(x_0)$, let $R : J \to \mathbb{R}$ be the auxiliary function defined as

\[
R(y) =
\begin{cases}
\dfrac{g(y) - g(y_0)}{y - y_0} & \text{if } y \ne y_0, \\[2ex]
g'(y_0) & \text{if } y = y_0.
\end{cases}
\]

We observe that the function $R$ is continuous at $y_0$ and

\[
g(y) - g(y_0) = R(y)(y - y_0) \quad \text{for every } y \in J.
\]

Hence, if $x \ne x_0$,

\[
\frac{g(f(x)) - g(f(x_0))}{x - x_0} = R(f(x))\, \frac{f(x) - f(x_0)}{x - x_0}.
\]

Since $f$ is continuous at $x_0$ and $R$ is continuous at $y_0 = f(x_0)$, the function $R \circ f$
is continuous at $x_0$, hence

\[
\lim_{x \to x_0} \frac{g(f(x)) - g(f(x_0))}{x - x_0} = \lim_{x \to x_0} R(f(x)) \cdot \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}
= R(f(x_0)) f'(x_0) = g'(f(x_0)) f'(x_0),
\]

which is what we wanted to prove. ∎




Example 1 Let $h : \mathbb{R} \to \mathbb{R}$ be defined as $h(x) = \cos(e^x)$. Then $h = g \circ f$, with $f(x) = e^x$ and $g(y) = \cos y$. For any $x_0 \in \mathbb{R}$, we have that $f'(x_0) = e^{x_0}$, and if $y_0 = f(x_0)$, then $g'(y_0) = -\sin y_0$. Therefore,
\[
h'(x_0) = g'(f(x_0))\, f'(x_0) = -\sin(e^{x_0})\, e^{x_0}.
\]

Example 2 Now let $h : \mathbb{R} \to \mathbb{R}$ be defined as $h(x) = e^{\cos x}$. Then $h = g \circ f$, with $f(x) = \cos x$ and $g(y) = e^y$. For any $x_0 \in \mathbb{R}$, we have that $f'(x_0) = -\sin x_0$, and if $y_0 = f(x_0)$, then $g'(y_0) = e^{y_0}$. Therefore,
\[
h'(x_0) = g'(f(x_0))\, f'(x_0) = e^{\cos x_0}\, (-\sin x_0).
\]
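The two chain-rule computations above can be sanity-checked numerically. The following is a minimal sketch, not part of the original text: the helper `central_diff`, the step `h = 1e-6`, and the sample point `x0 = 0.7` are illustrative assumptions, and a symmetric difference quotient only approximates the derivative.

```python
import math

def central_diff(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.7  # arbitrary sample point (an assumption of this sketch)

# Example 1: h(x) = cos(e^x), with f(x) = e^x and g(y) = cos y.
formula_1 = -math.sin(math.exp(x0)) * math.exp(x0)
numeric_1 = central_diff(lambda x: math.cos(math.exp(x)), x0)

# Example 2: h(x) = e^{cos x}, with f(x) = cos x and g(y) = e^y.
formula_2 = math.exp(math.cos(x0)) * (-math.sin(x0))
numeric_2 = central_diff(lambda x: math.exp(math.cos(x)), x0)

print(formula_1, numeric_1)
print(formula_2, numeric_2)
```

In each pair the closed-form value and the difference quotient should agree to several digits; the agreement illustrates the formula but of course does not prove it.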


We will now show how to compute the derivative of the inverse of an invertible 
function. 


Theorem 6.8 Let $I$, $J$ be two intervals, and $f : I \to J$ be a continuous invertible function. If $f$ is differentiable at $x_0$ and $f'(x_0) \neq 0$, then $f^{-1}$ is differentiable at $y_0 = f(x_0)$, and
\[
(f^{-1})'(y_0) = \frac{1}{f'(x_0)}.
\]


Proof We first observe that, by Theorem 2.14, the function $f^{-1} : J \to I$ is continuous. Then, by the change of variable formula,
\[
\lim_{y\to y_0} \frac{f^{-1}(y) - f^{-1}(y_0)}{y - y_0}
= \lim_{y\to y_0} \frac{f^{-1}(y) - x_0}{f(f^{-1}(y)) - f(x_0)}
= \lim_{x\to f^{-1}(y_0)} \frac{x - x_0}{f(x) - f(x_0)}
= \lim_{x\to x_0} \frac{1}{\dfrac{f(x) - f(x_0)}{x - x_0}} = \frac{1}{f'(x_0)},
\]
thereby proving the result. □


Example 1 If $f(x) = e^x$, then $f^{-1}(y) = \ln y$, and, for any $y_0 > 0$, writing $y_0 = e^{x_0}$, we have that
\[
D\ln(y_0) = \frac{1}{f'(x_0)} = \frac{1}{e^{x_0}} = \frac{1}{y_0}.
\]




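Example 2 Let $\alpha$ be any real number, and let $h : \,]0, +\infty[\, \to \mathbb{R}$ be defined by $h(x) = x^\alpha$. Since
\[
x^\alpha = e^{\alpha \ln x},
\]
we can write $h = g \circ f$, with $f(x) = \alpha \ln x$ and $g(y) = e^y$. Then
\[
h'(x_0) = g'(f(x_0))\, f'(x_0) = e^{\alpha \ln x_0}\, \alpha\, \frac{1}{x_0} = x_0^{\alpha}\, \alpha\, \frac{1}{x_0} = \alpha x_0^{\alpha - 1}.
\]
We thus see that the same formula we had found for an exponent $n \in \mathbb{N}$ also holds for any exponent $\alpha \in \mathbb{R}$.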


6.2 The Derivative Function 


We will now assume that the function $f : I \to \mathbb{R}$ is defined over an interval $I \subseteq \mathbb{R}$. We say that “$f$ is differentiable” if it is differentiable at every point of $I$. In this case, we can associate to every $x \in I$ the real number $f'(x)$, thereby defining a function $f' : I \to \mathbb{R}$, which is called the “derivative function” or simply “derivative” of $f$. Looking back at our previous examples, we can summarize the derivatives we have found in the following table:

$f(x)$ | $f'(x)$
$x^\alpha$ | $\alpha x^{\alpha-1}$
$e^x$ | $e^x$
$\ln x$ | $1/x$
$\cos x$ | $-\sin x$


Some care must be taken concerning the domains, of course. 




It might be interesting at this point to see whether the derivative function $f'$ has a derivative at some point $x_0$ of $I$. If it does, then we call $(f')'(x_0)$ the “second derivative” of $f$ at $x_0$ and denote it by one of the following symbols:
\[
f''(x_0), \qquad D^2 f(x_0), \qquad \frac{d^2 f}{dx^2}(x_0).
\]
It is now possible to proceed by induction and define the $n$th derivative of $f$ at $x_0$, using the notation
\[
f^{(n)}(x_0), \qquad D^n f(x_0), \qquad \frac{d^n f}{dx^n}(x_0),
\]
by setting $f^{(n)}(x_0) = (f^{(n-1)})'(x_0)$.

We say that $f : I \to \mathbb{R}$ is “$n$ times differentiable” if it is so at every point of $I$. If, moreover, the $n$th derivative $f^{(n)} : I \to \mathbb{R}$ is continuous, we say that $f$ is of class $C^n$. The set of those functions is denoted by $C^n(I, \mathbb{R})$ or sometimes by $C^n(I)$. In this setting, $C^0(I, \mathbb{R})$ is just $C(I, \mathbb{R})$, the set of continuous functions.

If $f$ is of class $C^n$ for every $n \in \mathbb{N}$, we say that $f$ is “infinitely differentiable.” The set of those functions is denoted by $C^\infty(I, \mathbb{R})$ or sometimes by $C^\infty(I)$. For example, the exponential function $f(x) = e^x$ belongs to this set, since
\[
D^n e^x = e^x, \quad \text{for every } n \geq 1.
\]


It can be verified that all the functions in the preceding table are infinitely 
differentiable on their domains. 


6.3. Remarkable Properties of the Derivative 


We say that $x_0 \in O$ is a “local maximum point” for the function $f : O \to \mathbb{R}$ if there exists a neighborhood $U$ of $x_0$ for which $f(U)$ has a maximum and $f(x_0) = \max f(U)$. Equivalently, if
\[
\exists \rho > 0 : \quad x \in B(x_0, \rho) \cap O \ \Rightarrow\ f(x) \leq f(x_0).
\]
A similar definition holds for “local minimum point.”

We will now compute the derivative of a function $f$ at the local maximum or minimum points, provided that they are not at the endpoints of the domain, the interval $I$.


Theorem 6.9 (Fermat Theorem I) Let $x_0$ be an internal point of $I$, and assume $f : I \to \mathbb{R}$ to be differentiable at $x_0$. If, moreover, $x_0$ is a local maximum or minimum point for $f$, then $f'(x_0) = 0$.




Proof If $x_0$ is an internal local maximum point for $f$, there exists a $\rho > 0$ such that $]x_0 - \rho, x_0 + \rho[\, \subseteq I$ and
\[
\frac{f(x) - f(x_0)}{x - x_0}
\begin{cases}
\geq 0 & \text{if } x_0 - \rho < x < x_0, \\
\leq 0 & \text{if } x_0 < x < x_0 + \rho.
\end{cases}
\]
Since $f$ is differentiable at $x_0$, the limit of the difference quotient exists, and it coincides with the left and right limits, i.e.,
\[
f'(x_0) = \lim_{x\to x_0^-} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x\to x_0^+} \frac{f(x) - f(x_0)}{x - x_0}.
\]
By the foregoing inequalities, as a consequence of sign permanence,
\[
\lim_{x\to x_0^-} \frac{f(x) - f(x_0)}{x - x_0} \geq 0 \geq \lim_{x\to x_0^+} \frac{f(x) - f(x_0)}{x - x_0}.
\]
Then it must be that $f'(x_0) = 0$. In the case of a local minimum point, one proceeds similarly. □


It is natural that the derivative, as any limit, provides us with local information 
on the behavior of a function. However, the following theorems will open the door 
to the study of the global properties of the graph of a function. 


Theorem 6.10 (Rolle Theorem) If $f : [a, b] \to \mathbb{R}$ is a continuous function, differentiable on $]a, b[$, and
\[
f(a) = f(b),
\]
then there exists a point $\xi \in \,]a, b[$ such that $f'(\xi) = 0$.


Proof If the function is constant, then its derivative is equal to zero at every point, and the conclusion trivially follows. Assume, then, that $f$ is not constant. Then there exists an $\bar{x} \in \,]a, b[$ such that
\[
\text{either } f(\bar{x}) < f(a) = f(b), \quad \text{or } f(\bar{x}) > f(a) = f(b).
\]
Let us consider the first case. By Weierstrass’ Theorem 4.10, $f$ has a minimum in $[a, b]$, and in this case any minimum point cannot be an endpoint of $[a, b]$; hence, it must be in $]a, b[$. Let $\xi \in \,]a, b[$ be such a point. By Fermat’s Theorem 6.9, it must be that $f'(\xi) = 0$.

The situation is analogous in the second case. By Weierstrass’ Theorem 4.10, $f$ has a maximum in $[a, b]$, and in this case any maximum point must be in $]a, b[$. By Fermat’s Theorem 6.9, if $\xi \in \,]a, b[$ is such a point, then $f'(\xi) = 0$. □




What follows is a generalization of the preceding theorem; it is also known as 
the mean value theorem. 


Theorem 6.11 (Lagrange Theorem) If $f : [a, b] \to \mathbb{R}$ is a continuous function, differentiable on $]a, b[$, then there exists a point $\xi \in \,]a, b[$ such that
\[
f'(\xi) = \frac{f(b) - f(a)}{b - a}.
\]

Proof We define the function $g : [a, b] \to \mathbb{R}$ as
\[
g(x) = f(x) - \Big[ f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a) \Big].
\]
Clearly $g$ is continuous on $[a, b]$, differentiable on $]a, b[$, and such that
\[
g(a) = 0 = g(b).
\]
By Rolle’s Theorem 6.10, there exists a point $\xi \in \,]a, b[$ where
\[
g'(\xi) = f'(\xi) - \frac{f(b) - f(a)}{b - a} = 0,
\]
whence the conclusion. □
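Lagrange's theorem only asserts that a point $\xi$ exists; for a concrete function one can try to locate it numerically. The sketch below does this for the illustrative choice $f(x) = x^3$ on $[0, 2]$ (not taken from the text), using bisection on $f'(\xi) - \frac{f(b)-f(a)}{b-a}$. The helper name `mean_value_point` is hypothetical, and the method assumes that this difference changes sign between the endpoints, which the theorem itself does not guarantee.

```python
import math

def mean_value_point(df, slope, a, b, tol=1e-12):
    """Bisection for xi in ]a, b[ with df(xi) = slope.

    Assumes df is continuous and df - slope has opposite signs at a and b.
    """
    lo, hi = a, b
    g_lo = df(lo) - slope
    g_hi = df(hi) - slope
    if g_lo * g_hi > 0:
        raise ValueError("f' - slope does not change sign at the endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = df(mid) - slope
        if g_mid == 0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # sign change lies in the right half
        else:
            hi = mid                # sign change lies in the left half
    return (lo + hi) / 2

f = lambda x: x ** 3
df = lambda x: 3 * x ** 2
a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)      # slope of the chord, here 4
xi = mean_value_point(df, slope, a, b)
expected = 2 / math.sqrt(3)          # solves 3 xi^2 = 4 by hand
print(xi, expected)
```

For this example the equation $3\xi^2 = 4$ can be solved by hand, so the numerical value can be compared with $2/\sqrt{3}$.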


Corollary 6.12 Let $I$ be an interval and $f : I \to \mathbb{R}$ a continuous function, differentiable on $I$. The following propositions hold:

(a) If $f'(x) \geq 0$ for every $x \in I$, then $f$ is increasing.
(b) If $f'(x) > 0$ for every $x \in I$, then $f$ is strictly increasing.
(c) If $f'(x) \leq 0$ for every $x \in I$, then $f$ is decreasing.
(d) If $f'(x) < 0$ for every $x \in I$, then $f$ is strictly decreasing.
(e) If $f'(x) = 0$ for every $x \in I$, then $f$ is constant.

Proof To prove (a), let $x_1 < x_2$ in $I$. By Lagrange’s Theorem 6.11, there exists a $\xi \in \,]x_1, x_2[$ such that
\[
f'(\xi) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}.
\]
Hence, since $f'(\xi) \geq 0$, it must be that $f(x_1) \leq f(x_2)$. This proves that $f$ is increasing. All the other propositions follow similarly. □


Remark 6.13 Note that if $f$ is increasing, then every difference quotient for $f$ is greater than or equal to zero, and therefore $f'(x) \geq 0$ for every $x \in I$. Hence, in (a), and the same also in (c) and (e), the implication can be reversed. But this is not the case for (b) and (d); indeed, if $f$ is strictly increasing, it is not true in general that $f'(x) > 0$ for every $x \in I$. The derivative could be equal to zero somewhere, as the example $f(x) = x^3$ shows.


6.4 Inverses of Trigonometric and Hyperbolic Functions


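Recalling the sign properties of the trigonometric functions and that $D\cos x = -\sin x$ and $D\sin x = \cos x$, we have the following properties:
\[
\cos \text{ is }
\begin{cases}
\text{strictly decreasing on } [0, \pi], \\
\text{strictly increasing on } [\pi, 2\pi],
\end{cases}
\qquad
\sin \text{ is }
\begin{cases}
\text{strictly increasing on } \big[-\frac{\pi}{2}, \frac{\pi}{2}\big], \\
\text{strictly decreasing on } \big[\frac{\pi}{2}, \frac{3\pi}{2}\big].
\end{cases}
\]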


Let us consider the two functions $F : [0, \pi] \to [-1, 1]$ and $G : \big[-\frac{\pi}{2}, \frac{\pi}{2}\big] \to [-1, 1]$ defined by $F(x) = \cos x$ and $G(x) = \sin x$. They are strictly monotone, hence injective. Moreover, because they are continuous, their image is an interval. Since $F(\pi) = -1 = G(-\frac{\pi}{2})$ and $F(0) = 1 = G(\frac{\pi}{2})$, both images coincide with $[-1, 1]$. Therefore, the two functions thus defined are bijective. We will call the functions $F^{-1} : [-1, 1] \to [0, \pi]$ and $G^{-1} : [-1, 1] \to \big[-\frac{\pi}{2}, \frac{\pi}{2}\big]$ “arccosine” and “arcsine,” respectively, and we will write
\[
F^{-1}(y) = \arccos y, \qquad G^{-1}(y) = \arcsin y.
\]


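The first one is strictly decreasing, whereas the second one is strictly increasing. Let us compute their derivatives. Setting $y = F(x)$, for $x \in \,]0, \pi[$ we have
\[
(F^{-1})'(y) = \frac{1}{F'(x)} = -\frac{1}{\sin x} = -\frac{1}{\sqrt{1 - \cos^2 x}} = -\frac{1}{\sqrt{1 - y^2}},
\]
while setting $y = G(x)$, for $x \in \,\big]-\frac{\pi}{2}, \frac{\pi}{2}\big[$ we have
\[
(G^{-1})'(y) = \frac{1}{G'(x)} = \frac{1}{\cos x} = \frac{1}{\sqrt{1 - \sin^2 x}} = \frac{1}{\sqrt{1 - y^2}}.
\]
Note that $\arccos + \arcsin$, having a derivative always equal to zero, is constant. Since its value at $0$ is $\frac{\pi}{2}$, we have that
\[
\arccos y + \arcsin y = \frac{\pi}{2} \quad \text{for every } y \in [-1, 1].
\]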
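The identity and the two derivative formulas above are easy to spot-check with the standard library. This is only an illustrative sketch: the sample points, the point `y0 = 0.3`, and the helper `central_diff` are assumptions made here, not part of the text.

```python
import math

def central_diff(func, y, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for func'(y)."""
    return (func(y + h) - func(y - h)) / (2 * h)

# arccos y + arcsin y should equal pi/2 on the whole interval [-1, 1].
samples = [-1.0, -0.5, 0.0, 0.3, 1.0]
sums = [math.acos(y) + math.asin(y) for y in samples]

# Derivatives at an interior point, compared with -1/sqrt(1-y^2) and 1/sqrt(1-y^2).
y0 = 0.3
d_arccos = central_diff(math.acos, y0)
d_arcsin = central_diff(math.asin, y0)
expected = 1 / math.sqrt(1 - y0 ** 2)

print(sums)
print(d_arccos, -expected)
print(d_arcsin, expected)
```

The derivative check is restricted to an interior point because the formulas are stated only for $y \in \,]-1, 1[$.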




Let us consider now the function $H : \,\big]-\frac{\pi}{2}, \frac{\pi}{2}\big[\, \to \mathbb{R}$ defined as $H(x) = \tan x$. Considerations similar to those given previously show that it is invertible. We will call the function $H^{-1} : \mathbb{R} \to \,\big]-\frac{\pi}{2}, \frac{\pi}{2}\big[$ “arctangent,” and we will write
\[
H^{-1}(y) = \arctan y.
\]
It is strictly increasing, with
\[
\lim_{y\to-\infty} \arctan y = -\frac{\pi}{2}, \qquad \lim_{y\to+\infty} \arctan y = \frac{\pi}{2}.
\]
Let us compute its derivative. Setting $y = H(x)$, for $x \in \,\big]-\frac{\pi}{2}, \frac{\pi}{2}\big[$ we have
\[
(H^{-1})'(y) = \frac{1}{H'(x)} = \cos^2 x = \frac{1}{1 + \tan^2 x} = \frac{1}{1 + y^2}.
\]
Let us now switch to the hyperbolic functions. The hyperbolic sine $\sinh : \mathbb{R} \to \mathbb{R}$ is strictly increasing and invertible. The inverse function can be written explicitly as
\[
\sinh^{-1}(y) = \ln\big(y + \sqrt{y^2 + 1}\big).
\]
The derivative of this function can be computed either directly or using the formula for the inverse function. If $y = \sinh(x)$, then
\[
D\sinh^{-1}(y) = \frac{1}{D\sinh(x)} = \frac{1}{\cosh(x)} = \frac{1}{\sqrt{1 + \sinh^2(x)}} = \frac{1}{\sqrt{1 + y^2}}.
\]
The hyperbolic cosine $\cosh : \mathbb{R} \to \mathbb{R}$ is neither injective (being even) nor surjective (since $\cosh x \geq 1$ for every $x \in \mathbb{R}$). On the other hand, the function $F : [0, +\infty[\, \to [1, +\infty[$, defined as $F(x) = \cosh x$, is strictly increasing and invertible. Its inverse function $F^{-1} : [1, +\infty[\, \to [0, +\infty[$ can be written explicitly as
\[
F^{-1}(y) = \ln\big(y + \sqrt{y^2 - 1}\big).
\]
It is often denoted, by an abuse of notation, by $\cosh^{-1}$. Let us compute its derivative. If $y = \cosh(x)$, with $x > 0$, then
\[
D\cosh^{-1}(y) = \frac{1}{D\cosh(x)} = \frac{1}{\sinh(x)} = \frac{1}{\sqrt{\cosh^2(x) - 1}} = \frac{1}{\sqrt{y^2 - 1}}.
\]
The function $\tanh : \mathbb{R} \to \mathbb{R}$ is strictly increasing, but it is not surjective, since $-1 < \tanh x < 1$ for every $x \in \mathbb{R}$. On the other hand, the function $H : \mathbb{R} \to \,]-1, 1[$, defined as $H(x) = \tanh x$, is invertible, and its inverse $H^{-1} : \,]-1, 1[\, \to \mathbb{R}$ is given by
\[
H^{-1}(y) = \frac{1}{2} \ln\Big(\frac{1 + y}{1 - y}\Big).
\]
It is often denoted, by an abuse of notation, by $\tanh^{-1}$. Let us compute its derivative. If $y = \tanh(x)$, then
\[
D\tanh^{-1}(y) = \frac{1}{D\tanh(x)} = \cosh^2(x) = \frac{1}{1 - \tanh^2(x)} = \frac{1}{1 - y^2}.
\]
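The explicit logarithmic expressions for the three inverse hyperbolic functions can be compared with the implementations in Python's `math` module. The function names `asinh_formula`, `acosh_formula`, and `atanh_formula` are hypothetical labels chosen for this sketch, and the sample points are arbitrary.

```python
import math

def asinh_formula(y):
    """ln(y + sqrt(y^2 + 1)), defined for every real y."""
    return math.log(y + math.sqrt(y * y + 1))

def acosh_formula(y):
    """ln(y + sqrt(y^2 - 1)), defined for y >= 1."""
    return math.log(y + math.sqrt(y * y - 1))

def atanh_formula(y):
    """(1/2) ln((1 + y) / (1 - y)), defined for -1 < y < 1."""
    return 0.5 * math.log((1 + y) / (1 - y))

asinh_gaps = [abs(asinh_formula(y) - math.asinh(y)) for y in (-1.5, 0.0, 2.0)]
acosh_gaps = [abs(acosh_formula(y) - math.acosh(y)) for y in (1.0, 1.5, 10.0)]
atanh_gaps = [abs(atanh_formula(y) - math.atanh(y)) for y in (-0.9, 0.0, 0.5)]

print(max(asinh_gaps), max(acosh_gaps), max(atanh_gaps))
```

For arguments of very large magnitude the naive formulas can lose accuracy in floating point, which is one reason library routines are preferred in practice; the formulas themselves remain exact identities.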


We can now return to the table of derivatives and enrich it with some of those found earlier.

$f(x)$ | $f'(x)$
$x^\alpha$ | $\alpha x^{\alpha-1}$
$e^x$ | $e^x$
$\ln x$ | $1/x$
$\cos x$ | $-\sin x$
$\sin x$ | $\cos x$
$\tan x$ | $1/\cos^2 x$
$\cosh x$ | $\sinh x$
$\sinh x$ | $\cosh x$
$\tanh x$ | $1/\cosh^2 x$
$\arccos x$ | $-1/\sqrt{1 - x^2}$
$\arcsin x$ | $1/\sqrt{1 - x^2}$
$\arctan x$ | $1/(1 + x^2)$
$\cosh^{-1} x$ | $1/\sqrt{x^2 - 1}$
$\sinh^{-1} x$ | $1/\sqrt{x^2 + 1}$
$\tanh^{-1} x$ | $1/(1 - x^2)$


6.5 Convexity and Concavity 


As usual, in what follows, $I \subseteq \mathbb{R}$ will denote a nondegenerate interval.
We will say that a function $f : I \to \mathbb{R}$ is “convex” if, taking arbitrarily three points $x_1 < x_2 < x_3$ in $I$, the following inequality holds:
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} \leq \frac{f(x_3) - f(x_2)}{x_3 - x_2}. \tag{a}
\]




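Let us show that inequality (a) is equivalent to the following ones:
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} \leq \frac{f(x_3) - f(x_1)}{x_3 - x_1}, \tag{b}
\]
\[
\frac{f(x_3) - f(x_1)}{x_3 - x_1} \leq \frac{f(x_3) - f(x_2)}{x_3 - x_2}. \tag{c}
\]
Indeed,
\[
\begin{aligned}
& \frac{f(x_2) - f(x_1)}{x_2 - x_1} \leq \frac{f(x_3) - f(x_2)}{x_3 - x_2} \\
\Leftrightarrow\ & (f(x_2) - f(x_1))(x_3 - x_2) \leq (f(x_3) - f(x_2))(x_2 - x_1) \\
\Leftrightarrow\ & (f(x_2) - f(x_1))(x_3 - x_1 + x_1 - x_2) \leq (f(x_3) - f(x_1) + f(x_1) - f(x_2))(x_2 - x_1) \\
\Leftrightarrow\ & (f(x_2) - f(x_1))(x_3 - x_1) \leq (f(x_3) - f(x_1))(x_2 - x_1) \\
\Leftrightarrow\ & \frac{f(x_2) - f(x_1)}{x_2 - x_1} \leq \frac{f(x_3) - f(x_1)}{x_3 - x_1},
\end{aligned}
\]
proving that (a) $\Leftrightarrow$ (b). The proof of the equivalence (a) $\Leftrightarrow$ (c) is analogous.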
We may now observe that $f : I \to \mathbb{R}$ is convex if and only if, for every $x_0 \in I$, the difference quotient function $F : I \setminus \{x_0\} \to \mathbb{R}$, defined by
\[
F(x) = \frac{f(x) - f(x_0)}{x - x_0},
\]
is increasing. Indeed, taking $x, x'$ in $I \setminus \{x_0\}$ such that $x < x'$, we can see that $F(x) \leq F(x')$ in all three possible cases: $x < x' < x_0$, or $x < x_0 < x'$, or $x_0 < x < x'$.

The following characterization of a convex differentiable function will now be of 
no surprise. 


Theorem 6.14 If $f : I \to \mathbb{R}$ is continuous, differentiable on $I$, then
\[
f \text{ is convex} \quad \Leftrightarrow \quad f' \text{ is increasing on } I.
\]


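Proof Assume that $f$ is convex. Let $\alpha < \beta$ be two points in $I$. If $\alpha < x < \beta$, then, by (b), we have
\[
\frac{f(x) - f(\alpha)}{x - \alpha} \leq \frac{f(\beta) - f(\alpha)}{\beta - \alpha},
\]
whence, since $f$ is differentiable at $\alpha$,
\[
f'(\alpha) = \lim_{x\to\alpha^+} \frac{f(x) - f(\alpha)}{x - \alpha} \leq \frac{f(\beta) - f(\alpha)}{\beta - \alpha}.
\]
Analogously, by (c), we have
\[
\frac{f(\beta) - f(\alpha)}{\beta - \alpha} \leq \frac{f(\beta) - f(x)}{\beta - x},
\]
whence, since $f$ is differentiable at $\beta$,
\[
f'(\beta) = \lim_{x\to\beta^-} \frac{f(\beta) - f(x)}{\beta - x} \geq \frac{f(\beta) - f(\alpha)}{\beta - \alpha}.
\]
Then $f'(\alpha) \leq f'(\beta)$, showing that $f'$ is increasing on $I$.
Conversely, assume $f'$ to be increasing on $I$. Taking $x_1 < x_2 < x_3$ arbitrarily in $I$, by Lagrange’s Theorem 6.11,
\[
\exists\, \xi_1 \in \,]x_1, x_2[\, : \quad f'(\xi_1) = \frac{f(x_2) - f(x_1)}{x_2 - x_1},
\]
and
\[
\exists\, \xi_2 \in \,]x_2, x_3[\, : \quad f'(\xi_2) = \frac{f(x_3) - f(x_2)}{x_3 - x_2}.
\]
Notice that $\xi_1 < \xi_2$. Since $f'$ is increasing on $I$, it must be that $f'(\xi_1) \leq f'(\xi_2)$, thereby yielding inequality (a). □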


We will say that $f$ is “strictly convex” if, taking arbitrarily three points $x_1 < x_2 < x_3$ in $I$, we have that
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} < \frac{f(x_3) - f(x_2)}{x_3 - x_2}. \tag{a'}
\]
Equivalently,
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} < \frac{f(x_3) - f(x_1)}{x_3 - x_1}, \tag{b'}
\]
or
\[
\frac{f(x_3) - f(x_1)}{x_3 - x_1} < \frac{f(x_3) - f(x_2)}{x_3 - x_2}. \tag{c'}
\]




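The following characterization also holds true in this case.

Theorem 6.15 If $f : I \to \mathbb{R}$ is continuous, differentiable on $I$, then
\[
f \text{ is strictly convex} \quad \Leftrightarrow \quad f' \text{ is strictly increasing on } I.
\]

Proof We need to slightly modify the proof of the previous theorem. Assume that $f$ is strictly convex, and let $\alpha < \beta$ be two points in $I$. If $\alpha < x < \frac{1}{2}(\alpha + \beta)$, by (b') we have
\[
\frac{f(x) - f(\alpha)}{x - \alpha} < \frac{f\big(\frac{\alpha+\beta}{2}\big) - f(\alpha)}{\frac{\alpha+\beta}{2} - \alpha} < \frac{f(\beta) - f(\alpha)}{\beta - \alpha},
\]
whence
\[
f'(\alpha) = \lim_{x\to\alpha^+} \frac{f(x) - f(\alpha)}{x - \alpha} \leq \frac{f\big(\frac{\alpha+\beta}{2}\big) - f(\alpha)}{\frac{\alpha+\beta}{2} - \alpha} < \frac{f(\beta) - f(\alpha)}{\beta - \alpha}.
\]
Analogously, if $\frac{1}{2}(\alpha + \beta) < x < \beta$, then, by (c'), we have
\[
\frac{f(\beta) - f(\alpha)}{\beta - \alpha} < \frac{f(\beta) - f\big(\frac{\alpha+\beta}{2}\big)}{\beta - \frac{\alpha+\beta}{2}} < \frac{f(\beta) - f(x)}{\beta - x},
\]
whence
\[
f'(\beta) = \lim_{x\to\beta^-} \frac{f(\beta) - f(x)}{\beta - x} \geq \frac{f(\beta) - f\big(\frac{\alpha+\beta}{2}\big)}{\beta - \frac{\alpha+\beta}{2}} > \frac{f(\beta) - f(\alpha)}{\beta - \alpha}.
\]
Then $f'(\alpha) < f'(\beta)$, thereby proving that $f'$ is strictly increasing on $I$.
Conversely, assume $f'$ to be strictly increasing on $I$. Taking $x_1 < x_2 < x_3$ in $I$, by Lagrange’s Theorem 6.11, exactly as in the proof of the previous theorem, we obtain inequality (a'). □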


We will say that f is “concave” if the function (— f) is convex or, equivalently, 
the opposite inequality in (a) (or in (b) or in (c)) holds. Analogously, we will say 
that f is “strictly concave” if the function (— f) is strictly convex or, equivalently, 
the opposite inequality in (a’) (or in (b’) or in (c’)) holds. Clearly enough, analogous 
theorems can be written characterizing either the concavity or the strict concavity 
of f when f is differentiable and f’ is either decreasing or strictly decreasing, 
respectively. 

We can now state the following corollary, which is widely applied in practice. 




Corollary 6.16 Let $I$ be an interval and $f : I \to \mathbb{R}$ be a continuous function, twice differentiable on $I$. The following propositions hold:

(a) If $f''(x) \geq 0$ for every $x \in I$, then $f$ is convex.
(b) If $f''(x) > 0$ for every $x \in I$, then $f$ is strictly convex.
(c) If $f''(x) \leq 0$ for every $x \in I$, then $f$ is concave.
(d) If $f''(x) < 0$ for every $x \in I$, then $f$ is strictly concave.

Proof Let us prove (a). Since $f''(x) \geq 0$ for every $x \in I$, by Corollary 6.12, the function $f' : I \to \mathbb{R}$ is increasing. Hence, by Theorem 6.14, the function $f : I \to \mathbb{R}$ is convex. The other properties follow similarly. □

Recalling Remark 6.13, we can observe that in (a) and (c) the implications can be reversed. If $f$ is convex, then $f''(x) \geq 0$ for every $x \in I$, and similarly, if $f$ is concave, then $f''(x) \leq 0$ for every $x \in I$. But this is not permitted either in (b), as the example $f(x) = x^4$ shows, or in (d).


Example 1 The exponential function $f(x) = e^x$ is strictly convex since
\[
f''(x) = e^x > 0, \quad \text{for every } x \in \mathbb{R}.
\]
Its inverse $\ln(x)$, the natural logarithm, is strictly concave.

Example 2 Since $D^2 \cos x = -\cos x$ and $D^2 \sin x = -\sin x$, recalling the sign properties of these functions, we have that
\[
\cos \text{ is }
\begin{cases}
\text{strictly concave on } \big[-\frac{\pi}{2}, \frac{\pi}{2}\big], \\
\text{strictly convex on } \big[\frac{\pi}{2}, \frac{3\pi}{2}\big],
\end{cases}
\qquad
\sin \text{ is }
\begin{cases}
\text{strictly concave on } [0, \pi], \\
\text{strictly convex on } [\pi, 2\pi].
\end{cases}
\]
The points separating an interval where the function is convex from an interval where it is concave are called “inflexion points.” For the cosine function, the set of inflexion points is $\{\frac{\pi}{2} + k\pi : k \in \mathbb{Z}\}$, whereas for the sine function it is $\{k\pi : k \in \mathbb{Z}\}$.


A similar analysis can be made on all the other elementary functions introduced 
till now. 

The following property of a differentiable convex function can be useful. It states, 
roughly speaking, that its graph always lies above any of its tangents. 




Theorem 6.17 If $f : I \to \mathbb{R}$ is convex and it is differentiable at some point $x_0 \in I$, then
\[
f(x) \geq f'(x_0)(x - x_0) + f(x_0), \quad \text{for every } x \in I.
\]

Proof The inequality surely holds if $x = x_0$. Thus, let us assume $x \neq x_0$.
If $x > x_0$, taking $h > 0$ such that $x_0 < x_0 + h < x$, then, by the convexity of $f$, we have that
\[
\frac{f(x) - f(x_0)}{x - x_0} \geq \frac{f(x_0 + h) - f(x_0)}{h}.
\]
Taking the limit as $h \to 0^+$ we find
\[
\frac{f(x) - f(x_0)}{x - x_0} \geq f'(x_0),
\]
thereby leading to the inequality we want to prove.
On the other hand, if $x < x_0$, taking $h < 0$ such that $x < x_0 + h < x_0$, then, by the convexity of $f$, we have that
\[
\frac{f(x_0) - f(x)}{x_0 - x} \leq \frac{f(x_0) - f(x_0 + h)}{-h},
\]
i.e.,
\[
\frac{f(x) - f(x_0)}{x - x_0} \leq \frac{f(x_0 + h) - f(x_0)}{h}.
\]
Taking the limit as $h \to 0^-$, the conclusion follows as well. □
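Both the three-point inequality (a) and the tangent-line inequality of Theorem 6.17 can be tested on random points for a function known to be convex, here $f(x) = e^x$. This is a minimal sketch under stated assumptions: the helper names, the sampling range $[-3, 3]$, the seed, and the small tolerances (added only to absorb floating-point rounding) are all choices made for illustration.

```python
import math
import random

def slope(f, u, v):
    """Difference quotient of f between u and v (u != v)."""
    return (f(v) - f(u)) / (v - u)

def satisfies_three_point(f, x1, x2, x3, tol=1e-9):
    """Inequality (a) for x1 < x2 < x3, up to a rounding tolerance."""
    return slope(f, x1, x2) <= slope(f, x2, x3) + tol

def lies_above_tangent(f, df, x0, x, tol=1e-12):
    """f(x) >= f'(x0)(x - x0) + f(x0), up to a rounding tolerance."""
    return f(x) >= df(x0) * (x - x0) + f(x0) - tol

rng = random.Random(0)
triples = [sorted(rng.uniform(-3, 3) for _ in range(3)) for _ in range(1000)]

chord_ok = all(satisfies_three_point(math.exp, *t) for t in triples)
tangent_ok = all(lies_above_tangent(math.exp, math.exp, t[0], t[2]) for t in triples)
print(chord_ok, tangent_ok)

# For comparison, the concave function ln violates (a) on a sample triple.
print(satisfies_three_point(math.log, 1.0, 2.0, 3.0))
```

A passing check on sampled points is only evidence, not a proof; the proofs are the ones given in Theorems 6.14 and 6.17.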


6.6 L’Hôpital’s Rules
We first need to prove the following generalization of the Lagrange Theorem 6.11. 


Theorem 6.18 (Cauchy Theorem) If $f, g : [a, b] \to \mathbb{R}$ are two continuous functions, differentiable on $]a, b[$, with $g'(x) \neq 0$ for every $x \in \,]a, b[$, then there exists a point $\xi \in \,]a, b[$ such that
\[
\frac{f'(\xi)}{g'(\xi)} = \frac{f(b) - f(a)}{g(b) - g(a)}.
\]

Proof We define the function $h : [a, b] \to \mathbb{R}$ as
\[
h(x) = (g(b) - g(a))\, f(x) - (f(b) - f(a))\, g(x).
\]
It is continuous on $[a, b]$, differentiable on $]a, b[$, and such that $h(a) = h(b)$. Then Rolle’s Theorem 6.10 guarantees the existence of a point $\xi \in \,]a, b[$ where $h'(\xi) = 0$, whence the conclusion. □


Notice that Lagrange’s Theorem 6.11 can now be seen as a corollary of Cauchy’s 
theorem by taking g(x) = x. 

In the remainder of the book, it will be convenient to adopt the following notation. Whenever $a$ is greater than $b$, the symbol $[a, b]$ indicates the interval $[b, a]$, and $]a, b[$ indicates $]b, a[$. Note that the statement of Cauchy’s Theorem 6.18 remains valid in this case as well.

The following result is known as “L’Hôpital’s rule in the indeterminate case $\frac{0}{0}$.”


Theorem 6.19 (L’Hôpital Theorem I) Let $I$ be an interval containing a point $x_0$, and let $f, g : I \setminus \{x_0\} \to \mathbb{R}$ be two differentiable functions, with $g'(x) \neq 0$ for every $x \in I \setminus \{x_0\}$, such that
\[
\lim_{x\to x_0} f(x) = \lim_{x\to x_0} g(x) = 0.
\]
If the limit
\[
\lim_{x\to x_0} \frac{f'(x)}{g'(x)}
\]
exists, then the limit
\[
\lim_{x\to x_0} \frac{f(x)}{g(x)}
\]
also exists, and the two coincide.


Proof Set $l = \lim_{x\to x_0} \frac{f'(x)}{g'(x)}$ (allowing the possibility that $l = +\infty$ or $-\infty$). Let us extend the two functions at the point $x_0$ by setting $f(x_0) = g(x_0) = 0$; in this way, $f$ and $g$ will be continuous on the whole interval $I$. By Cauchy’s Theorem 6.18, for every $x \neq x_0$ in $I$ there is a point $\xi_x \in \,]x_0, x[$ (depending on $x$) such that
\[
\frac{f'(\xi_x)}{g'(\xi_x)} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)} = \frac{f(x)}{g(x)}.
\]
Notice that $\lim_{x\to x_0} \xi_x = x_0$. Then, using the change of variables formula (3.1),
\[
\lim_{x\to x_0} \frac{f(x)}{g(x)} = \lim_{x\to x_0} \frac{f'(\xi_x)}{g'(\xi_x)} = \lim_{y \to \lim_{x\to x_0} \xi_x} \frac{f'(y)}{g'(y)} = \lim_{y\to x_0} \frac{f'(y)}{g'(y)} = l,
\]
and the proof is complete. □


Note that the preceding theorem does not exclude the possibility of xo being an 
endpoint of the interval J, in which case we are dealing with left or right limits. 

We also observe that the conclusion of the statement is written as an implication: 
If the limit of the quotient of the two derivatives exists, then the limit of the quotient 
of the two functions exists. The opposite implication is not true, as we can see in the 
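following example. Let $x_0 = 0$,
\[
f(x) = x^2 \sin\Big(\frac{1}{x}\Big), \qquad g(x) = x.
\]
Then $\lim_{x\to 0} f(x) = \lim_{x\to 0} g(x) = 0$,
\[
\lim_{x\to 0} \frac{f(x)}{g(x)} = \lim_{x\to 0} x \sin\Big(\frac{1}{x}\Big) = 0,
\]
while
\[
\frac{f'(x)}{g'(x)} = 2x \sin\Big(\frac{1}{x}\Big) - \cos\Big(\frac{1}{x}\Big),
\]
so that the limit $\lim_{x\to 0} \frac{f'(x)}{g'(x)}$ does not exist.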

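As an example of the application of L’Hôpital’s rule, let $I = \mathbb{R}$, $x_0 = 0$, $f(x) = \sin x - x$, and $g(x) = x^3$. Then, from
\[
\lim_{x\to 0} \frac{f'(x)}{g'(x)} = \lim_{x\to 0} \frac{\cos x - 1}{3x^2} = -\frac{1}{6},
\]
we deduce that
\[
\lim_{x\to 0} \frac{\sin x - x}{x^3} = -\frac{1}{6}.
\]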

The following corollary can be useful in determining whether a function is 
differentiable at some point xo. 




Corollary 6.20 Let $I$ be an interval containing a point $x_0$, and let $f : I \to \mathbb{R}$ be a continuous function, differentiable at all $x \neq x_0$. If the limit
\[
l = \lim_{x\to x_0} f'(x)
\]
exists, then the derivative of $f$ at $x_0$ exists, and it coincides with $l$.

Proof Let $F(x) = f(x) - f(x_0)$ and $G(x) = x - x_0$. We have that $G'(x) \neq 0$ for every $x \neq x_0$,
\[
\lim_{x\to x_0} F(x) = \lim_{x\to x_0} G(x) = 0,
\]
and
\[
\lim_{x\to x_0} \frac{F'(x)}{G'(x)} = \lim_{x\to x_0} f'(x) = l.
\]
By L’Hôpital’s rule we have that
\[
\lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x\to x_0} \frac{F(x)}{G(x)} = l,
\]
i.e., $f'(x_0) = l$. □


L’Hôpital’s rule can be extended to cases where $x_0 = +\infty$ or $-\infty$. Let us analyze here the first case; the other one is analogous.

Theorem 6.21 (L’Hôpital Theorem II) Let $I$ be an interval, unbounded from above, and let $f, g : I \to \mathbb{R}$ be two differentiable functions, with $g'(x) \neq 0$ for every $x \in I$, such that
\[
\lim_{x\to+\infty} f(x) = \lim_{x\to+\infty} g(x) = 0.
\]
If the limit
\[
\lim_{x\to+\infty} \frac{f'(x)}{g'(x)}
\]
exists, then the limit
\[
\lim_{x\to+\infty} \frac{f(x)}{g(x)}
\]
also exists, and the two coincide.




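Proof Set $l = \lim_{x\to+\infty} \frac{f'(x)}{g'(x)}$. Defining the two functions $F(x) = f(x^{-1})$ and $G(x) = g(x^{-1})$, we see that $G'(x) \neq 0$ for every $x$, and
\[
\lim_{x\to 0^+} F(x) = \lim_{x\to 0^+} G(x) = 0.
\]
Moreover,
\[
\lim_{x\to 0^+} \frac{F'(x)}{G'(x)} = \lim_{x\to 0^+} \frac{f'(x^{-1})(-x^{-2})}{g'(x^{-1})(-x^{-2})} = \lim_{x\to 0^+} \frac{f'(x^{-1})}{g'(x^{-1})} = \lim_{y\to+\infty} \frac{f'(y)}{g'(y)} = l.
\]
Then, by Theorem 6.19, $\lim_{x\to 0^+} \frac{F(x)}{G(x)} = l$, and hence
\[
\lim_{x\to+\infty} \frac{f(x)}{g(x)} = \lim_{u\to 0^+} \frac{f(u^{-1})}{g(u^{-1})} = \lim_{u\to 0^+} \frac{F(u)}{G(u)} = l,
\]
thereby proving the result. □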


We will now state what is called “L’Hôpital’s rule in the indeterminate case $\frac{\infty}{\infty}$.” In the following theorem, $\infty$ can be either $+\infty$ or $-\infty$.

Theorem 6.22 (L’Hôpital Theorem III) Let $I$ be an interval containing a point $x_0$, and let $f, g : I \setminus \{x_0\} \to \mathbb{R}$ be two differentiable functions, with $g'(x) \neq 0$ for every $x \in I \setminus \{x_0\}$, such that
\[
\lim_{x\to x_0} f(x) = \lim_{x\to x_0} g(x) = \infty.
\]
If the limit
\[
\lim_{x\to x_0} \frac{f'(x)}{g'(x)}
\]
exists, then the limit
\[
\lim_{x\to x_0} \frac{f(x)}{g(x)}
\]
also exists, and the two coincide.


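Proof Set $l = \lim_{x\to x_0} \frac{f'(x)}{g'(x)}$. We first assume that $l \in \mathbb{R}$, and that $x_0$ is not the right endpoint of $I$. Let $\varepsilon > 0$ be fixed. Then there exists a $\delta_1 > 0$ such that
\[
x_0 < x < x_0 + \delta_1 \ \Rightarrow\ \Big| \frac{f'(x)}{g'(x)} - l \Big| \leq \frac{\varepsilon}{2}.
\]
By Cauchy’s Theorem 6.18, for every $x \in \,]x_0, x_0 + \delta_1[$ there is a $\xi_x \in \,]x, x_0 + \delta_1[$ such that
\[
\frac{f'(\xi_x)}{g'(\xi_x)} = \frac{f(x_0 + \delta_1) - f(x)}{g(x_0 + \delta_1) - g(x)},
\]
and hence
\[
x_0 < x < x_0 + \delta_1 \ \Rightarrow\ \Big| \frac{f(x_0 + \delta_1) - f(x)}{g(x_0 + \delta_1) - g(x)} - l \Big| \leq \frac{\varepsilon}{2}.
\]
We can moreover assume that $\delta_1$ was chosen small enough so that
\[
x_0 < x < x_0 + \delta_1 \ \Rightarrow\ f(x) \neq 0 \ \text{ and } \ g(x) \neq 0.
\]
Let us write
\[
\frac{f(x_0 + \delta_1) - f(x)}{g(x_0 + \delta_1) - g(x)} = \psi(x)\, \frac{f(x)}{g(x)}
\]
and observe that
\[
\lim_{x\to x_0^+} \psi(x) = \lim_{x\to x_0^+} \frac{1 - f(x_0 + \delta_1)/f(x)}{1 - g(x_0 + \delta_1)/g(x)} = 1.
\]
In particular,
\[
\lim_{x\to x_0^+} \frac{1}{\psi(x)}\Big(l - \frac{\varepsilon}{2}\Big) = l - \frac{\varepsilon}{2}, \qquad
\lim_{x\to x_0^+} \frac{1}{\psi(x)}\Big(l + \frac{\varepsilon}{2}\Big) = l + \frac{\varepsilon}{2},
\]
so that there exists a $\delta \in \,]0, \delta_1[$ such that, if $x_0 < x < x_0 + \delta$, then $\psi(x) > 0$ and
\[
\frac{1}{\psi(x)}\Big(l - \frac{\varepsilon}{2}\Big) \geq l - \varepsilon, \qquad
\frac{1}{\psi(x)}\Big(l + \frac{\varepsilon}{2}\Big) \leq l + \varepsilon.
\]
Therefore, if $x_0 < x < x_0 + \delta$, then
\[
l - \varepsilon \leq \frac{1}{\psi(x)}\Big(l - \frac{\varepsilon}{2}\Big) \leq \frac{1}{\psi(x)}\, \frac{f(x_0 + \delta_1) - f(x)}{g(x_0 + \delta_1) - g(x)} \leq \frac{1}{\psi(x)}\Big(l + \frac{\varepsilon}{2}\Big) \leq l + \varepsilon,
\]
and hence
\[
\Big| \frac{f(x)}{g(x)} - l \Big| \leq \varepsilon.
\]
We have thus proved that
\[
\lim_{x\to x_0^+} \frac{f(x)}{g(x)} = l.
\]
In a perfectly analogous way one proves that, if $x_0$ is not the left endpoint of $I$, then
\[
\lim_{x\to x_0^-} \frac{f(x)}{g(x)} = l,
\]
so that the theorem is proved in the case where $l \in \mathbb{R}$.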
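Assume now that $l = +\infty$ and that $x_0$ is not the right endpoint of $I$. Let $\alpha > 0$ be fixed. Then there exists a $\delta_1 > 0$ such that
\[
x_0 < x < x_0 + \delta_1 \ \Rightarrow\ \frac{f'(x)}{g'(x)} \geq \alpha.
\]
Proceeding as previously, we have that
\[
x_0 < x < x_0 + \delta_1 \ \Rightarrow\ \frac{f(x_0 + \delta_1) - f(x)}{g(x_0 + \delta_1) - g(x)} \geq \alpha.
\]
We can, moreover, assume that $\delta_1$ has been chosen small enough so that
\[
x_0 < x < x_0 + \delta_1 \ \Rightarrow\ f(x) \neq 0 \ \text{ and } \ g(x) \neq 0.
\]
Let $\psi(x)$ be defined as previously. There exists a $\delta \in \,]0, \delta_1[$ such that
\[
x_0 < x < x_0 + \delta \ \Rightarrow\ 0 < \psi(x) < 2.
\]
Therefore, if $x_0 < x < x_0 + \delta$, then
\[
\frac{1}{\psi(x)}\, \frac{f(x_0 + \delta_1) - f(x)}{g(x_0 + \delta_1) - g(x)} \geq \frac{\alpha}{\psi(x)},
\]
and hence
\[
\frac{f(x)}{g(x)} \geq \frac{\alpha}{\psi(x)} > \frac{\alpha}{2}.
\]
Since $\alpha > 0$ is arbitrary, we have thus proved that
\[
\lim_{x\to x_0^+} \frac{f(x)}{g(x)} = +\infty.
\]
In a perfectly analogous way one proves that, if $x_0$ is not the left endpoint of $I$, then
\[
\lim_{x\to x_0^-} \frac{f(x)}{g(x)} = +\infty,
\]
so that the theorem is proved also in the case $l = +\infty$. Finally, the case $l = -\infty$ can be reduced to the previous one by observing that a change of sign in one of the two functions leads back to the previously proved case. □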


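Example We want to compute
\[
\lim_{x\to 0^+} x \ln x.
\]
Setting $f(x) = \ln x$ and $g(x) = 1/x$, we see that $\lim_{x\to 0^+} f(x) = -\infty$ and $\lim_{x\to 0^+} g(x) = +\infty$. Moreover,
\[
\lim_{x\to 0^+} \frac{f'(x)}{g'(x)} = \lim_{x\to 0^+} \frac{1/x}{-1/x^2} = \lim_{x\to 0^+} (-x) = 0.
\]
Hence, also
\[
\lim_{x\to 0^+} x \ln x = \lim_{x\to 0^+} \frac{f(x)}{g(x)} = 0.
\]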
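As a quick illustration of this limit, one can tabulate $x \ln x$ along a sequence tending to $0^+$. The sequence $10^{-k}$ and the function names are arbitrary choices for this sketch; the table shows the trend but is no substitute for the argument above.

```python
import math

def x_log_x(x):
    """The product x * ln x, defined for x > 0."""
    return x * math.log(x)

def derivative_quotient(x):
    """f'(x)/g'(x) for f = ln and g(x) = 1/x; it simplifies to -x."""
    return (1 / x) / (-1 / x ** 2)

samples = [10.0 ** (-k) for k in range(1, 9)]
values = [x_log_x(x) for x in samples]

for x, v in zip(samples, values):
    print(x, v, derivative_quotient(x))
```

The values are negative and shrink in magnitude, consistently with the limit being $0$.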


Even in the indeterminate case $\frac{\infty}{\infty}$ we can extend L’Hôpital’s rule to cases where $x_0 = +\infty$ or $-\infty$. Let us see, e.g., the first case.

Theorem 6.23 (L’Hôpital Theorem IV) Let $I$ be an interval, unbounded from above, and let $f, g : I \to \mathbb{R}$ be two differentiable functions, with $g'(x) \neq 0$ for every $x \in I$, such that
\[
\lim_{x\to+\infty} f(x) = \lim_{x\to+\infty} g(x) = \infty.
\]
If the limit
\[
\lim_{x\to+\infty} \frac{f'(x)}{g'(x)}
\]
exists, then the limit
\[
\lim_{x\to+\infty} \frac{f(x)}{g(x)}
\]
also exists, and the two coincide.


The proof is analogous to that of Theorem 6.21, so we omit it for brevity’s sake. 


6.7 Taylor Formula


The following theorem provides us with the so-called “Taylor formula with Lagrange’s form of the remainder.”

Theorem 6.24 (Taylor Theorem I) Let $x \neq x_0$ be two points of an interval $I$ and $f : I \to \mathbb{R}$ be $n + 1$ times differentiable. Then there exists a $\xi \in \,]x_0, x[$ such that
\[
f(x) = p_n(x) + r_n(x),
\]
where
\[
p_n(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2!} f''(x_0)(x - x_0)^2 + \dots + \frac{1}{n!} f^{(n)}(x_0)(x - x_0)^n
\]
is the “$n$th-order Taylor polynomial associated with the function $f$ at the point $x_0$,” and
\[
r_n(x) = \frac{f^{(n+1)}(\xi)\,(x - x_0)^{n+1}}{(n+1)!}
\]
is the “Lagrange form of the remainder.”


Proof We first observe that the polynomial $p_n$ satisfies the following properties:
\[
p_n(x_0) = f(x_0), \quad p_n'(x_0) = f'(x_0), \quad p_n''(x_0) = f''(x_0), \quad \dots, \quad p_n^{(n)}(x_0) = f^{(n)}(x_0).
\]
By Cauchy’s Theorem 6.18, we can find a point $\xi_1 \in \,]x_0, x[$ such that
\[
\frac{f(x) - p_n(x)}{(x - x_0)^{n+1}} = \frac{(f(x) - p_n(x)) - (f(x_0) - p_n(x_0))}{(x - x_0)^{n+1} - (x_0 - x_0)^{n+1}} = \frac{f'(\xi_1) - p_n'(\xi_1)}{(n+1)(\xi_1 - x_0)^n}.
\]
Again by Cauchy’s Theorem 6.18, we can find a point $\xi_2 \in \,]x_0, \xi_1[$ such that
\[
\frac{f'(\xi_1) - p_n'(\xi_1)}{(n+1)(\xi_1 - x_0)^n} = \frac{(f'(\xi_1) - p_n'(\xi_1)) - (f'(x_0) - p_n'(x_0))}{(n+1)(\xi_1 - x_0)^n - (n+1)(x_0 - x_0)^n} = \frac{f''(\xi_2) - p_n''(\xi_2)}{(n+1)n(\xi_2 - x_0)^{n-1}}.
\]
Proceeding by induction, we find $n + 1$ points $\xi_1, \xi_2, \dots, \xi_{n+1}$ such that
\[
\frac{f(x) - p_n(x)}{(x - x_0)^{n+1}} = \frac{f'(\xi_1) - p_n'(\xi_1)}{(n+1)(\xi_1 - x_0)^n} = \frac{f''(\xi_2) - p_n''(\xi_2)}{(n+1)n(\xi_2 - x_0)^{n-1}} = \dots = \frac{f^{(n+1)}(\xi_{n+1}) - p_n^{(n+1)}(\xi_{n+1})}{(n+1)!}.
\]
If $x > x_0$, these points satisfy the inequalities
\[
x_0 < \xi_{n+1} < \xi_n < \dots < \xi_2 < \xi_1 < x,
\]
whereas if $x < x_0$, they are in the opposite order. Since the $(n+1)$th derivative of an $n$th-order polynomial is constantly equal to zero, we have that $p_n^{(n+1)}(\xi_{n+1}) = 0$, and setting $\xi = \xi_{n+1}$ we conclude. □


If $n = 0$, then the preceding Taylor formula is simply
\[
f(x) = f(x_0) + f'(\xi)(x - x_0), \quad \text{for some } \xi \in \,]x_0, x[,
\]
which is the outcome of Lagrange’s Theorem 6.11.
Note that the Taylor polynomial
\[
p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k \tag{6.3}
\]
could have a degree smaller than $n$ (here $f^{(0)}$ simply denotes $f$). For example, if $f$ is a constant function, then the degree of $p_n(x)$ is equal to $0$.


Examples Let us now determine the Taylor polynomial of some elementary functions, taking for simplicity $x_0 = 0$ (in which case it is sometimes called a “Maclaurin polynomial”).

1. Let $f(x) = e^x$. Then
\[
p_n(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^n}{n!} = \sum_{k=0}^{n} \frac{x^k}{k!}.
\]
2. Let $f(x) = \cos x$. Then, if either $n = 2m$ or $n = 2m + 1$,
\[
p_n(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots + (-1)^m \frac{x^{2m}}{(2m)!} = \sum_{k=0}^{m} (-1)^k \frac{x^{2k}}{(2k)!}.
\]
3. Let $f(x) = \sin x$. Then, if either $n = 2m + 1$ or $n = 2m + 2$,
\[
p_n(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots + (-1)^m \frac{x^{2m+1}}{(2m+1)!} = \sum_{k=0}^{m} (-1)^k \frac{x^{2k+1}}{(2k+1)!}.
\]
4. Now let $f(x) = \frac{1}{1-x}$. It can be shown by induction that
\[
f^{(n)}(x) = \frac{n!}{(1 - x)^{n+1}}.
\]
Then $f^{(n)}(0) = n!$, and hence
\[
p_n(x) = 1 + x + x^2 + x^3 + \dots + x^n.
\]
5. We proceed similarly for the function $f(x) = \frac{1}{1+x}$ and find
\[
p_n(x) = 1 - x + x^2 - x^3 + \dots + (-1)^n x^n.
\]
6. Consider now the function $f(x) = \ln(1 + x)$. Its derivative coincides with the previous function, and we easily obtain
\[
p_n(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + (-1)^{n-1} \frac{x^n}{n}.
\]
7. Another example where the Taylor polynomial has an explicit formula is given by the function $f(x) = \frac{1}{1+x^2}$. If either $n = 2m$ or $n = 2m + 1$, then we have
\[
p_n(x) = 1 - x^2 + x^4 - x^6 + \dots + (-1)^m x^{2m}.
\]
8. At this point it is easy to deal with the function $f(x) = \arctan x$, whose derivative is the previous function. If either $n = 2m + 1$ or $n = 2m + 2$, then
\[
p_n(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots + (-1)^m \frac{x^{2m+1}}{2m+1}.
\]

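In a similar way we can find the Taylor polynomials of the hyperbolic functions $\cosh x$, $\sinh x$, and that of $\tanh^{-1} x$. The following summary table may be useful:

$f(x)$ | $p_n(x)$ at the point $x_0 = 0$
$e^x$ | $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^n}{n!}$
$\ln(1+x)$ | $x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + (-1)^{n-1}\frac{x^n}{n}$
$\cos x$ | $1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots + (-1)^m \frac{x^{2m}}{(2m)!}$
$\sin x$ | $x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots + (-1)^m \frac{x^{2m+1}}{(2m+1)!}$
$\arctan x$ | $x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots + (-1)^m \frac{x^{2m+1}}{2m+1}$
$\cosh x$ | $1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \dots + \frac{x^{2m}}{(2m)!}$
$\sinh x$ | $x + \frac{x^3}{3!} + \frac{x^5}{5!} + \frac{x^7}{7!} + \dots + \frac{x^{2m+1}}{(2m+1)!}$
$\tanh^{-1} x$ | $x + \frac{x^3}{3} + \frac{x^5}{5} + \frac{x^7}{7} + \dots + \frac{x^{2m+1}}{2m+1}$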


On the other hand, there is no elementary expression for the Taylor polynomial of the functions $\tan x$ and $\tanh x$. We report here only the first few terms:

$\tan x$ | $x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \dots$
$\tanh x$ | $x - \frac{x^3}{3} + \frac{2x^5}{15} - \frac{17x^7}{315} + \dots$
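The first terms reported for $\tan x$ and $\tanh x$ can be checked against the library functions near $0$. In the sketch below the list `TAN_TERMS` simply transcribes the four coefficients above; the names `tan_poly` and `tanh_poly` and the test point $x = 0.1$ are illustrative assumptions.

```python
import math

# (exponent, coefficient) pairs of the terms x + x^3/3 + 2x^5/15 + 17x^7/315.
TAN_TERMS = [(1, 1.0), (3, 1 / 3), (5, 2 / 15), (7, 17 / 315)]

def tan_poly(x):
    """Degree-7 Maclaurin polynomial of tan x."""
    return sum(c * x ** k for k, c in TAN_TERMS)

def tanh_poly(x):
    """Same magnitudes with alternating signs: the polynomial for tanh x."""
    return sum((-1) ** ((k - 1) // 2) * c * x ** k for k, c in TAN_TERMS)

x = 0.1
tan_error = abs(tan_poly(x) - math.tan(x))
tanh_error = abs(tanh_poly(x) - math.tanh(x))
print(tan_error, tanh_error)
```

At $x = 0.1$ both errors should be of the order of $x^9$, since the first omitted term has degree $9$.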




6.8 Local Maxima and Minima 


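Assuming $x_0$ to be fixed, we would like to take some limit in the Taylor formula as $x$ tends to $x_0$. Hence, for every $x \neq x_0$, to emphasize the fact that the point $\xi \in \,]x_0, x[$ in the Taylor formula depends on $x$, we will write $\xi = \xi_x$.

Whenever $f^{(n+1)}$ happens to be bounded in a neighbourhood of $x_0$, we see that
\[
\lim_{x\to x_0} \frac{r_n(x)}{(x - x_0)^n} = \lim_{x\to x_0} \frac{1}{(n+1)!}\, f^{(n+1)}(\xi_x)\,(x - x_0) = 0.
\]
This relation is sometimes written using the following notation:
\[
r_n(x) = o(|x - x_0|^n) \quad \text{if } x \to x_0.
\]
This is surely true if $f^{(n+1)}$ is continuous at $x_0$, in which case
\[
\lim_{x\to x_0} \frac{r_n(x)}{(x - x_0)^{n+1}} = \lim_{x\to x_0} \frac{1}{(n+1)!}\, f^{(n+1)}(\xi_x) = \frac{1}{(n+1)!}\, f^{(n+1)}(x_0).
\]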

Theorem 6.25 Let $f \in C^2(I, \mathbb{R})$, and assume that $x_0$ is an internal point of $I$. If $f'(x_0) = 0$ and $f''(x_0) > 0$, then $x_0$ is a local minimum point for $f$. On the other hand, if $f'(x_0) = 0$ and $f''(x_0) < 0$, then $x_0$ is a local maximum point for $f$.

Proof Let us prove the first statement; the second one is analogous. Using the Taylor formula with $n = 1$, we have that
\[
\lim_{x\to x_0} \frac{f(x) - f(x_0)}{(x - x_0)^2} = \lim_{x\to x_0} \frac{f(x) - [f(x_0) + f'(x_0)(x - x_0)]}{(x - x_0)^2} = \lim_{x\to x_0} \frac{r_1(x)}{(x - x_0)^2} = \frac{1}{2} f''(x_0) > 0.
\]
Hence, using sign permanence, there is a neighborhood $U$ of $x_0$ such that $f(x) > f(x_0)$ for every $x \in U \setminus \{x_0\}$. □


Whenever $f'(x_0) = 0$ and $f''(x_0) = 0$, we will need further information. Always assuming that $x_0$ is an internal point of $I$ and that $f$ is sufficiently regular, it can be seen that if $f'''(x_0) \ne 0$, then $x_0$ will be neither a local minimum nor a local maximum point for $f$. On the other hand, if $f'''(x_0) = 0$, then

\[ \text{if } f'(x_0) = f''(x_0) = f'''(x_0) = 0 \text{ and } f''''(x_0) > 0, \text{ then } x_0 \text{ is a local minimum point,} \]

whereas

\[ \text{if } f'(x_0) = f''(x_0) = f'''(x_0) = 0 \text{ and } f''''(x_0) < 0, \text{ then } x_0 \text{ is a local maximum point.} \]


This procedure can be continued, of course, but we avoid the details for brevity’s 
sake. 
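To make the continuation of this procedure concrete, here is a small Python sketch. It assumes the standard general rule, which the text states only up to the fourth derivative: the first nonvanishing derivative at $x_0$ decides, giving an extremum if its order is even and no extremum if its order is odd. The function name and the way the derivative values are passed in are hypothetical choices for this illustration.

```python
def classify_critical_point(derivatives):
    """Classify x0 from the values f'(x0), f''(x0), f'''(x0), ... given in this order.

    Returns 'local minimum', 'local maximum', 'neither', or 'undecided'
    (the last one when every supplied derivative vanishes).
    """
    for order, value in enumerate(derivatives, start=1):
        if value != 0:
            if order % 2 == 1:
                return "neither"          # first nonzero derivative has odd order
            return "local minimum" if value > 0 else "local maximum"
    return "undecided"

print(classify_critical_point([0, 0, 0, 24]))   # f(x) = x^4 at x0 = 0
print(classify_critical_point([0, 0, 6]))       # f(x) = x^3 at x0 = 0
```

The caller is responsible for supplying correct derivative values; the sketch does not differentiate anything itself.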


6.9 Analyticity of Some Elementary Functions 


Somewhat surprisingly, the Taylor polynomial at x9 may be a good approximation 
of a function even at distant points x if the degree is taken large enough. An example 
follows, provided by the exponential function. 


Theorem 6.26 For every $x \in \mathbb{R}$, we have that

\[ e^x = \lim_n \left( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^n}{n!} \right). \]


Proof The formula clearly holds when $x = 0$. Assuming $x \ne 0$, by Taylor's Theorem 6.24 there exists a $\xi \in\, ]0, x[$ such that $f(x) = p_n(x) + r_n(x)$, with

\[ r_n(x) = f^{(n+1)}(\xi) \frac{x^{n+1}}{(n+1)!} = e^{\xi} \frac{x^{n+1}}{(n+1)!}. \]

We want to prove that $\lim_n r_n(x) = 0$. Notice that for any $x \in \mathbb{R}$,

\[ |r_n(x)| \le e^{|x|} \frac{|x|^{n+1}}{(n+1)!}. \]

Since we proved in Theorem 3.30 that $\lim_n \frac{a^n}{n!} = 0$ for every $a \in \mathbb{R}$, the conclusion follows. ∎
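The estimate used in the proof can be observed numerically. The sketch below, in Python, compares the actual error $|e^x - p_n(x)|$ with the bound $e^{|x|}|x|^{n+1}/(n+1)!$ at the fairly distant point $x = 10$; the function names and the sample values of $n$ are illustrative assumptions, and floating-point arithmetic limits how small an error can be observed.

```python
import math

def exp_taylor(x, n):
    """p_n(x) = 1 + x + x^2/2! + ... + x^n/n!, built term by term."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k
        total += term
    return total

def remainder_bound(x, n):
    """The estimate e^{|x|} |x|^{n+1} / (n+1)! appearing in the proof."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

x = 10.0
for n in (5, 20, 40):
    print(n, abs(math.exp(x) - exp_taylor(x, n)), remainder_bound(x, n))
```

For small $n$ the polynomial is a poor approximation at $x = 10$, which is consistent with the remark that the degree must be taken large enough.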


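Instead of

\[ e^x = \lim_n \left( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^n}{n!} \right), \]

we will briefly write

\[ e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \]

This is the “Taylor series” associated with the exponential function at the point $x_0 = 0$.
A similar phenomenon holds for the cosine and sine functions.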


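Theorem 6.27 For every $x \in \mathbb{R}$, we have that

\[
\begin{aligned}
\cos x &= \lim_m \left( 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots + (-1)^m \frac{x^{2m}}{(2m)!} \right), \\
\sin x &= \lim_m \left( x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots + (-1)^m \frac{x^{2m+1}}{(2m+1)!} \right).
\end{aligned}
\]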


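Proof Following the lines of the previous proof and using the fact that $|\cos \xi| \le 1$ and $|\sin \xi| \le 1$ for every $\xi \in \mathbb{R}$, we see that

\[ |r_n(x)| \le \frac{|x|^{n+1}}{(n+1)!}. \]

Since $\lim_n \frac{a^n}{n!} = 0$ for every $a \in \mathbb{R}$, the conclusion follows. ∎

We will briefly write

\[ \cos x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}, \qquad \sin x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}. \]

Similarly, one can prove that

\[ \cosh x = \sum_{k=0}^{\infty} \frac{x^{2k}}{(2k)!}, \qquad \sinh x = \sum_{k=0}^{\infty} \frac{x^{2k+1}}{(2k+1)!}. \]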


The functions $f \in C^{\infty}(I, \mathbb{R})$ for which $f(x) = \lim_n p_n(x)$ for every $x \in I$ are called “analytic” on $I$. This is not the case for every function. For instance, the function $f : \mathbb{R} \to \mathbb{R}$, defined as

\[
f(x) =
\begin{cases}
e^{-1/x^2} & \text{if } x \ne 0, \\
0 & \text{if } x = 0,
\end{cases}
\]

is infinitely differentiable, and $f^{(k)}(0) = 0$ for every $k \in \mathbb{N}$, hence $p_n(x)$ is identically equal to zero. The reader is invited to verify this.
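A numerical look at this kind of counterexample may help. The Python sketch below assumes the classical function $e^{-1/x^2}$ (extended by $0$ at the origin); it only illustrates that the function is strictly positive away from $0$ while being flatter than any power of $x$ near $0$, which is what one expects when all derivatives at $0$ vanish. It is not a proof of differentiability.

```python
import math

def f(x):
    """Assumed example: e^{-1/x^2} for x != 0, and 0 at x = 0."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

# Every Taylor polynomial at 0 is identically zero, yet f is positive away from 0.
print(f(0.5))

# f(x) / x^k stays tiny near 0 for each fixed k, consistent with f^(k)(0) = 0.
for k in (1, 5, 10):
    print(k, f(0.05) / 0.05**k)
```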



7 The Integral


In this chapter, we denote by $I$ a compact interval of the real line $\mathbb{R}$, i.e.,

\[ I = [a, b], \quad \text{for some } a < b. \]


7.1 Riemann Sums 


First, we choose in $I$ some points

\[ a = a_0 < a_1 < \dots < a_{m-1} < a_m = b, \]

thereby obtaining a “partition” of $I$ made by the intervals $[a_{j-1}, a_j]$, with $j = 1, \dots, m$. Then, for each $j$, we choose a point

\[ x_j \in [a_{j-1}, a_j]. \]

A “tagged partition” of $I$ is the set

\[ P = \{ (x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m]) \}. \]


Examples Let $I = [0, 1]$. Here are some tagged partitions of $I$:


P= {(t.10.1)| 
e=((0.[.4]). ED] 


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_7


#=((4o4)) GLa) - GD] 
= {(s[oa)) CL) GLAD GLAD) 


We now consider a function $f : I \to \mathbb{R}$. For each tagged partition $P$ as above we define the number

\[ S(f, P) = \sum_{j=1}^{m} f(x_j)(a_j - a_{j-1}), \]

which is called the “Riemann sum” associated with $f$ and $P$.
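The definition translates directly into a few lines of code. In the Python sketch below a tagged partition is represented as a list of pairs `(x_j, (a_{j-1}, a_j))`; this representation, the function name `riemann_sum`, and the particular tagged partition of $[0, 1]$ are hypothetical choices made for illustration, and exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction as F

def riemann_sum(f, tagged_partition):
    """S(f, P) = sum of f(x_j) * (a_j - a_{j-1}) over the pairs (x_j, [a_{j-1}, a_j])."""
    return sum(f(x) * (right - left) for x, (left, right) in tagged_partition)

# Hypothetical tagged partition of [0, 1], chosen only for this illustration.
P = [
    (F(1, 8), (F(0), F(1, 4))),
    (F(1, 2), (F(1, 4), F(3, 4))),
    (F(7, 8), (F(3, 4), F(1))),
]

print(riemann_sum(lambda x: 4 * x**2 - 1, P))
```

Terms where $f(x_j)$ is negative enter the sum with a negative sign, and a tag where $f$ vanishes contributes nothing.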

To better understand this definition, assume for simplicity that the function $f$ is positive on $I$. Then to each tagged partition of $I$ we associate the sum of the areas of the rectangles having base $[a_{j-1}, a_j]$ and height $[0, f(x_j)]$.

If $f$ is not positive on $I$, the areas will be considered with a positive or negative sign depending on whether $f(x_j)$ is positive or negative, respectively. If $f(x_j) = 0$, the $j$th term of the sum will clearly be equal to zero.


Example Let $f : [0, 1] \to \mathbb{R}$ be defined as $f(x) = 4x^2 - 1$, and let


P=(([o4)-GLa) GED] 


Then 




7.2 $\delta$-Fine Tagged Partitions


To measure how “fine” a tagged partition is, we will have to deal with a “gauge”, i.e., a positive function $\delta : I \to \mathbb{R}$.

If $\delta$ is a gauge on $I$, we say that the tagged partition $P$ is “$\delta$-fine” if, for every $j = 1, \dots, m$,

\[ x_j - a_{j-1} \le \delta(x_j) \quad \text{and} \quad a_j - x_j \le \delta(x_j); \]

equivalently, we may write

\[ [a_{j-1}, a_j] \subseteq [x_j - \delta(x_j), x_j + \delta(x_j)]. \]
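The two inequalities are easy to test mechanically. The following Python sketch checks them for equally spaced partitions of $[0, 1]$; the helper names, the list-of-pairs representation, and the constant gauge with value $\frac{1}{5}$ are illustrative assumptions. The check does not verify that the pieces actually cover the interval or that each tag lies in its piece.

```python
from fractions import Fraction as F

def is_delta_fine(tagged_partition, delta):
    """Check x_j - a_{j-1} <= delta(x_j) and a_j - x_j <= delta(x_j) for every j."""
    return all(
        x - left <= delta(x) and right - x <= delta(x)
        for x, (left, right) in tagged_partition
    )

def uniform_partition(m, tag):
    """m equal pieces of [0, 1]; tag(left, right) picks the point x_j."""
    pieces = [(F(j - 1, m), F(j, m)) for j in range(1, m + 1)]
    return [(tag(left, right), (left, right)) for left, right in pieces]

gauge = lambda x: F(1, 5)                       # constant gauge, an illustrative choice
midpoint = lambda left, right: (left + right) / 2
left_end = lambda left, right: left

print(is_delta_fine(uniform_partition(3, midpoint), gauge))
print(is_delta_fine(uniform_partition(4, left_end), gauge))
```

With midpoint tags only half the length of each piece has to fit under the gauge, whereas with endpoint tags the whole length does.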


We will now show that it is always possible to find a $\delta$-fine tagged partition of the compact interval $I$, whatever the gauge $\delta$.


Theorem 7.1 (Cousin Theorem) For every gauge $\delta$ on $I = [a, b]$ there is a $\delta$-fine tagged partition of $I$.


Proof Set $I_0 = I$, and assume by contradiction that there exists a gauge $\delta : I_0 \to \mathbb{R}$ for which there are no $\delta$-fine tagged partitions. Taking the midpoint of $I_0$, we divide it in two closed subintervals. At least one of these two subintervals will not have any $\delta$-fine tagged partition (otherwise we could glue together the two $\delta$-fine tagged partitions to get a $\delta$-fine tagged partition of the original interval $I_0$). Let us choose it and denote it by $I_1$. We now iterate the same procedure, thereby constructing a sequence

\[ I = I_0 \supseteq I_1 \supseteq I_2 \supseteq \dots \supseteq I_n \supseteq \dots \]

of closed subintervals, none of which has any $\delta$-fine tagged partitions. By Cantor's Theorem 1.9, there is a point $c$ belonging to all of these intervals. For $n$ sufficiently large, $I_n$ will be contained in $[c - \delta(c), c + \delta(c)]$. But then the set $P = \{(c, I_n)\}$, whose only element is the couple $(c, I_n)$, is a $\delta$-fine tagged partition of $I_n$, a contradiction. ∎
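The bisection idea in the proof suggests a naive search procedure, sketched below in Python. This is not the argument of the proof, which is by contradiction and gives no algorithm: the sketch tries only three candidate tags per piece (left end, right end, midpoint), so it may give up even though a $\delta$-fine tagged partition exists. The function name, the depth limit, and the discontinuous gauge used as test input are assumptions of this illustration.

```python
from fractions import Fraction as F

def find_delta_fine(u, v, delta, depth=0, max_depth=40):
    """Search for a delta-fine tagged partition of [u, v] by repeated bisection.

    Only three candidate tags per piece are tried, so the search is heuristic.
    """
    for c in (u, v, (u + v) / 2):
        if c - u <= delta(c) and v - c <= delta(c):
            return [(c, (u, v))]
    if depth == max_depth:
        raise RuntimeError("no suitable tag among the candidates tried")
    mid = (u + v) / 2
    return (find_delta_fine(u, mid, delta, depth + 1, max_depth)
            + find_delta_fine(mid, v, delta, depth + 1, max_depth))

# Illustrative discontinuous gauge on [0, 1] (an assumption of this sketch).
gauge = lambda x: F(1, 2) if x == 0 else x / 2

print(find_delta_fine(F(0), F(1), gauge))
```

For this gauge the search is forced to tag the first piece at $0$, since near $0$ the gauge is too small for any other tag to reach the left endpoint.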


Examples Let us provide some examples of $\delta$-fine tagged partitions of the interval $I = [0, 1]$.

We start with a constant gauge: $\delta(x) = \frac{1}{5}$. Since the previous theorem does not give any information on how to find a $\delta$-fine tagged partition, we will proceed by guessing. As a first guess, we choose the $a_j$ equally spaced and the $x_j$ as the midpoints of the intervals $[a_{j-1}, a_j]$, i.e.,

\[ a_j = \frac{j}{m}, \qquad x_j = \frac{a_{j-1} + a_j}{2} = \frac{2j - 1}{2m}. \]

For the corresponding tagged partition to be $\delta$-fine, it must be that

\[ x_j - a_{j-1} = \frac{1}{2m} \le \frac{1}{5} \quad \text{and} \quad a_j - x_j = \frac{1}{2m} \le \frac{1}{5}. \]

These inequalities are satisfied choosing $m \ge 3$. If $m = 3$, we have the $\delta$-fine tagged partition

\[ P = \left\{ \left( \tfrac{1}{6}, \left[ 0, \tfrac{1}{3} \right] \right), \left( \tfrac{1}{2}, \left[ \tfrac{1}{3}, \tfrac{2}{3} \right] \right), \left( \tfrac{5}{6}, \left[ \tfrac{2}{3}, 1 \right] \right) \right\}. \]

If, instead of taking the points $x_j$ in the middle of the respective intervals, we would like to choose them, for example, at the left endpoint, i.e., $x_j = \frac{j-1}{m}$, then in order to have a $\delta$-fine tagged partition we should ask that

\[ x_j - a_{j-1} = 0 \le \frac{1}{5} \quad \text{and} \quad a_j - x_j = \frac{1}{m} \le \frac{1}{5}. \]

These inequalities are verified if $m \ge 5$. For instance, if $m = 5$, then we have the $\delta$-fine tagged partition

\[ P = \left\{ \left( 0, \left[ 0, \tfrac{1}{5} \right] \right), \left( \tfrac{1}{5}, \left[ \tfrac{1}{5}, \tfrac{2}{5} \right] \right), \left( \tfrac{2}{5}, \left[ \tfrac{2}{5}, \tfrac{3}{5} \right] \right), \left( \tfrac{3}{5}, \left[ \tfrac{3}{5}, \tfrac{4}{5} \right] \right), \left( \tfrac{4}{5}, \left[ \tfrac{4}{5}, 1 \right] \right) \right\}. \]

Notice that, with such a choice of the $a_j$, if $m \ge 5$, then the points $x_j$ can actually be taken arbitrarily in the corresponding intervals $[a_{j-1}, a_j]$, still yielding $\delta$-fine tagged partitions.

The previous example shows how it is possible to construct $\delta$-fine tagged partitions in the case of a gauge $\delta$ that is constant with value $\frac{1}{5}$. It is clear that a similar procedure can be used for a constant gauge with arbitrary positive value. Consider now the case where $\delta$ is a continuous function. Then Weierstrass' Theorem 4.10 says that $\delta(x)$ has a minimum positive value: let it be $\bar{\delta}$. Consider then the constant gauge with value $\bar{\delta}$, and construct a $\bar{\delta}$-fine tagged partition with the procedure we saw earlier. Clearly, such a tagged partition must be $\delta$-fine as well. This argument shows how the case of a continuous gauge can be reduced to that of a constant gauge.

Consider now the noncontinuous gauge

\[
\delta(x) =
\begin{cases}
\frac{1}{2} & \text{if } x = 0, \\
\frac{x}{2} & \text{if } x \in\, ]0, 1].
\end{cases}
\]

As previously, we proceed by guessing. Let us try, as earlier, taking the $a_j$ equally distant and the $x_j$ as the midpoints of the intervals $[a_{j-1}, a_j]$. This time, however, we are going to fail; indeed, we should have

\[ x_1 = x_1 - a_0 \le \delta(x_1) = \frac{x_1}{2}, \]

which is clearly impossible if $x_1 > 0$. The only way to solve this problem is to choose $x_1 = 0$. We decide, then, for instance, to take the $x_j$ to coincide with $a_{j-1}$, as was also done earlier. We thus find the $\delta$-fine tagged partition


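\[ P = \left\{ \left( 0, \left[ 0, \tfrac{1}{2} \right] \right), \left( \tfrac{1}{2}, \left[ \tfrac{1}{2}, \tfrac{3}{4} \right] \right), \left( \tfrac{3}{4}, \left[ \tfrac{3}{4}, 1 \right] \right) \right\}. \]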


Notice that a more economic choice might have been 


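\[ P = \left\{ \left( 0, \left[ 0, \tfrac{1}{2} \right] \right), \left( 1, \left[ \tfrac{1}{2}, 1 \right] \right) \right\}. \]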


The choice $x_1 = 0$ is, however, unavoidable.
Finally, once a point $c \in\, ]0, 1[$ is fixed, let the gauge $\delta : [0, 1] \to \mathbb{R}$ be defined as


if x € [0,c[, 


ifx=c, 


if x €]c, 1]. 


Similar considerations to those made in the previous case lead to the conclusion
that, in order to have a $\delta$-fine tagged partition, it is necessary for one of the $x_j$ to be
equal to c. For example, if c = 5; a possible choice is 


= (fod) GED -GLD-@laD- CBD) 


7.3 Integrable Functions on a Compact Interval


We now want to define some kind of convergence of the Riemann sums when the tagged partitions become “finer and finer”. The following definition is due to Jaroslav Kurzweil and Ralph Henstock.


Definition 7.2 A function $f : I \to \mathbb{R}$ is said to be “integrable” if there is a real number $\mathcal{J}$ with the following property: Given $\varepsilon > 0$, it is possible to find a gauge $\delta : I \to \mathbb{R}$ such that, for every $\delta$-fine tagged partition $P$ of $I$,

\[ |S(f, P) - \mathcal{J}| \le \varepsilon. \]

We will also say that $f$ is “integrable on $I$.”


Let us prove that there is at most one $\mathcal{J} \in \mathbb{R}$ that verifies the conditions of the definition. If there were a second one, say, $\mathcal{J}'$, then, for every $\varepsilon > 0$, there would be two gauges $\delta$ and $\delta'$ on $I$ associated respectively with $\mathcal{J}$ and $\mathcal{J}'$, satisfying the condition of the definition. Define the gauge

\[ \delta''(x) = \min\{\delta(x), \delta'(x)\}. \]

Once a $\delta''$-fine tagged partition $P$ of $I$ is chosen, we have that $P$ is both $\delta$-fine and $\delta'$-fine, and hence

\[ |\mathcal{J} - \mathcal{J}'| \le |\mathcal{J} - S(f, P)| + |S(f, P) - \mathcal{J}'| \le 2\varepsilon. \]

Since this holds for every $\varepsilon > 0$, it necessarily must be that $\mathcal{J} = \mathcal{J}'$.

If $f : I \to \mathbb{R}$ is an integrable function, the only element $\mathcal{J} \in \mathbb{R}$ verifying the conditions of the definition is called the “integral” of $f$ on $I$ and is denoted by one of the following symbols:

\[ \int_I f, \qquad \int_a^b f, \qquad \int_I f(x)\, dx, \qquad \int_a^b f(x)\, dx. \]

The presence of the letter $x$ in the preceding notation has no independent importance. It could be replaced by any other letter $t$, $u$, $\alpha$, ..., or by any other symbol, unless already used with another meaning. For reasons to be explained later on, we set, moreover,

\[ \int_b^a f = -\int_a^b f, \quad \text{and} \quad \int_a^a f = 0. \]


Examples

1. As a first example, consider a constant function $f(x) = c$. In this case, for any tagged partition $P$ of $[a, b]$,

\[ S(f, P) = \sum_{j=1}^{m} c\,(a_j - a_{j-1}) = c \sum_{j=1}^{m} (a_j - a_{j-1}) = c\,(b - a), \]

hence also

\[ \int_a^b c\, dx = c\,(b - a). \]

Indeed, once we fix $\varepsilon > 0$, it is readily seen in this simple case that any gauge $\delta : [a, b] \to \mathbb{R}$ satisfies the condition of the definition, with $\mathcal{J} = c\,(b - a)$, since for every $\delta$-fine tagged partition $P$ of $I$ we have $|S(f, P) - \mathcal{J}| = 0 \le \varepsilon$.

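2. As a second example, consider the function $f(x) = x$. Then

\[ S(f, P) = \sum_{j=1}^{m} x_j (a_j - a_{j-1}). \]

To find a candidate for the integral, let us consider a particular tagged partition where the $x_j$ are the midpoints of the intervals $[a_{j-1}, a_j]$. In this particular case, we have

\[ \sum_{j=1}^{m} \frac{a_{j-1} + a_j}{2} (a_j - a_{j-1}) = \frac{1}{2} \sum_{j=1}^{m} (a_j^2 - a_{j-1}^2) = \frac{1}{2}(b^2 - a^2). \]

We want to prove now that the function $f(x) = x$ is integrable on $[a, b]$ and that its integral is really $\frac{1}{2}(b^2 - a^2)$. Fix $\varepsilon > 0$. For any tagged partition $P$ we have

\[
\begin{aligned}
\left| S(f, P) - \frac{1}{2}(b^2 - a^2) \right|
&= \left| \sum_{j=1}^{m} x_j (a_j - a_{j-1}) - \sum_{j=1}^{m} \frac{a_{j-1} + a_j}{2} (a_j - a_{j-1}) \right| \\
&\le \sum_{j=1}^{m} \left| x_j - \frac{a_{j-1} + a_j}{2} \right| (a_j - a_{j-1}) \\
&\le \sum_{j=1}^{m} \frac{a_j - a_{j-1}}{2} (a_j - a_{j-1}).
\end{aligned}
\]

Choosing the gauge $\delta$ to be constant with value $\frac{\varepsilon}{b - a}$, we have $a_j - a_{j-1} \le 2\delta$ for every $\delta$-fine tagged partition, so that for every $\delta$-fine tagged partition $P$ we have

\[ \left| S(f, P) - \frac{1}{2}(b^2 - a^2) \right| \le \sum_{j=1}^{m} \delta\,(a_j - a_{j-1}) = \frac{\varepsilon}{b - a} \sum_{j=1}^{m} (a_j - a_{j-1}) = \varepsilon. \]

The condition of the definition is thus verified with this choice of the gauge, and we have proved that

\[ \int_a^b x\, dx = \frac{1}{2}(b^2 - a^2). \]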
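The estimate can be tested on randomly generated tagged partitions. The Python sketch below assumes a constant gauge with value $\varepsilon/(b - a)$ and builds partitions whose pieces are no longer than that value, which is sufficient (though not necessary) for them to be $\delta$-fine; the helper names, the interval $[0, 3]$, and the fixed random seed are illustrative choices.

```python
from fractions import Fraction as F
import random

def riemann_sum(f, tagged_partition):
    return sum(f(x) * (right - left) for x, (left, right) in tagged_partition)

def random_fine_partition(a, b, delta, rng):
    """Random tagged partition of [a, b] whose pieces have length at most delta.

    For a constant gauge delta this makes the partition delta-fine,
    wherever the tag sits inside its piece.
    """
    partition, left = [], a
    while left < b:
        right = min(left + delta * F(rng.randint(1, 100), 100), b)
        tag = left + (right - left) * F(rng.randint(0, 100), 100)
        partition.append((tag, (left, right)))
        left = right
    return partition

a, b, eps = F(0), F(3), F(1, 10)
delta = eps / (b - a)                      # constant gauge value
rng = random.Random(0)                     # fixed seed, so runs are repeatable
errors = [
    abs(riemann_sum(lambda x: x, random_fine_partition(a, b, delta, rng))
        - (b**2 - a**2) / 2)
    for _ in range(20)
]
print(max(errors) <= eps)
```

A finite number of trials cannot replace the proof, of course; the check only shows the inequality at work.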




7.4 Elementary Properties of the Integral 


Let $f : I \to \mathbb{R}$ and $g : I \to \mathbb{R}$ be two real functions and $\alpha \in \mathbb{R}$ a constant. It is easy to verify that for every tagged partition $P$ of $I$,

\[ S(f + g, P) = S(f, P) + S(g, P) \]

and

\[ S(\alpha f, P) = \alpha S(f, P). \]

These linearity properties are inherited by the integral, as will be proved in the following two propositions.


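Proposition 7.3 If $f$ and $g$ are integrable on $I$, then $f + g$ is integrable on $I$ and

\[ \int_I (f + g) = \int_I f + \int_I g. \]

Proof Set $\mathcal{J}_1 = \int_I f$ and $\mathcal{J}_2 = \int_I g$. Once $\varepsilon > 0$ is fixed, there are two gauges $\delta_1$ and $\delta_2$ on $I$ such that, for every tagged partition $P$ of $I$, if $P$ is $\delta_1$-fine, then

\[ |S(f, P) - \mathcal{J}_1| \le \frac{\varepsilon}{2}, \]

whereas if $P$ is $\delta_2$-fine, then

\[ |S(g, P) - \mathcal{J}_2| \le \frac{\varepsilon}{2}. \]

Let us define the gauge $\delta : I \to \mathbb{R}$ as $\delta(x) = \min\{\delta_1(x), \delta_2(x)\}$. Let $P$ be a $\delta$-fine tagged partition of $I$. It is thus both $\delta_1$-fine and $\delta_2$-fine, hence

\[
\begin{aligned}
|S(f + g, P) - (\mathcal{J}_1 + \mathcal{J}_2)| &= |S(f, P) - \mathcal{J}_1 + S(g, P) - \mathcal{J}_2| \\
&\le |S(f, P) - \mathcal{J}_1| + |S(g, P) - \mathcal{J}_2| \\
&\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}
\]

This completes the proof. ∎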


Proposition 7.4 If $f$ is integrable on $I$ and $\alpha \in \mathbb{R}$, then $\alpha f$ is integrable on $I$ and

\[ \int_I (\alpha f) = \alpha \int_I f. \]

Proof If $\alpha = 0$, then the identity is surely true. If $\alpha \ne 0$, then set $\mathcal{J} = \int_I f$ and fix $\varepsilon > 0$. There is a gauge $\delta$ on $I$ such that

\[ |S(f, P) - \mathcal{J}| \le \frac{\varepsilon}{|\alpha|} \]

for every $\delta$-fine tagged partition $P$ of $I$. Then, for every $\delta$-fine tagged partition $P$ of $I$, we have

\[ |S(\alpha f, P) - \alpha \mathcal{J}| = |\alpha S(f, P) - \alpha \mathcal{J}| = |\alpha|\, |S(f, P) - \mathcal{J}| \le |\alpha| \frac{\varepsilon}{|\alpha|} = \varepsilon, \]

and the proof is thus completed. ∎

We have just proved that the set of integrable functions is a real vector space and that the integral is a linear function on it.
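The two identities for Riemann sums on which these propositions rest can be checked exactly on any example. In the Python sketch below the functions, the constant $\alpha$, and the tagged partition of $[0, 1]$ are arbitrary illustrative choices, and rational arithmetic avoids rounding.

```python
from fractions import Fraction as F

def riemann_sum(f, tagged_partition):
    return sum(f(x) * (right - left) for x, (left, right) in tagged_partition)

# Hypothetical tagged partition of [0, 1] and sample data for the check.
P = [(F(1, 3), (F(0), F(1, 2))), (F(3, 4), (F(1, 2), F(1)))]
f = lambda x: x**2
g = lambda x: 3 * x + 1
alpha = F(-5, 2)

sum_of_functions = riemann_sum(lambda x: f(x) + g(x), P)
scaled_function = riemann_sum(lambda x: alpha * f(x), P)
print(sum_of_functions == riemann_sum(f, P) + riemann_sum(g, P))
print(scaled_function == alpha * riemann_sum(f, P))
```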

We now study the behavior of the integral with respect to the order relation in $\mathbb{R}$.


Proposition 7.5 If $f$ is integrable on $I$ and $f(x) \ge 0$ for every $x \in I$, then

\[ \int_I f \ge 0. \]

Proof Fix $\varepsilon > 0$. There is a gauge $\delta$ on $I$ such that

\[ \left| S(f, P) - \int_I f \right| \le \varepsilon \]

for every $\delta$-fine tagged partition $P$ of $I$. Hence,

\[ \int_I f \ge S(f, P) - \varepsilon \ge -\varepsilon, \]

since clearly $S(f, P) \ge 0$. Since this is true for every $\varepsilon > 0$, it must be that $\int_I f \ge 0$, thereby proving the result. ∎


Corollary 7.6 If $f$ and $g$ are integrable on $I$ and $f(x) \le g(x)$ for every $x \in I$, then

\[ \int_I f \le \int_I g. \]

Proof It is sufficient to apply the preceding proposition to the function $g - f$. ∎




Corollary 7.7 If $f$ and $|f|$ are integrable on $I$, then

\[ \left| \int_I f \right| \le \int_I |f|. \]

Proof Applying the preceding corollary to the inequalities

\[ -|f| \le f \le |f|, \]

we have

\[ -\int_I |f| \le \int_I f \le \int_I |f|, \]

whence the conclusion. ∎


7.5 The Fundamental Theorem 




The following theorem establishes an unexpected link between differential and integral calculus. It is called the Fundamental Theorem of differential and integral calculus.


Theorem 7.8 (Fundamental Theorem I) Let $F : [a, b] \to \mathbb{R}$ be a differentiable function, and let $f$ be its derivative: $F'(x) = f(x)$ for every $x \in [a, b]$. Then $f$ is integrable on $[a, b]$ and

\[ \int_a^b f = F(b) - F(a). \]


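Proof Let $\varepsilon > 0$ be fixed. We know that for every $x \in [a, b]$,

\[ f(x) = F'(x) = \lim_{u \to x} \frac{F(u) - F(x)}{u - x}. \]

Then for every $x \in [a, b]$, there is a $\delta(x) > 0$ such that, for every $u \in [a, b]$,

\[ 0 < |u - x| \le \delta(x) \;\Rightarrow\; \left| \frac{F(u) - F(x)}{u - x} - f(x) \right| \le \frac{\varepsilon}{b - a}, \]

i.e.,

\[ |u - x| \le \delta(x) \;\Rightarrow\; |F(u) - F(x) - f(x)(u - x)| \le \frac{\varepsilon}{b - a} |u - x|. \]

We have thus defined a gauge $\delta : [a, b] \to \mathbb{R}$.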


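Consider now a $\delta$-fine tagged partition of $I$,

\[ P = \{ (x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m]) \}. \]

Since, for every $j = 1, \dots, m$,

\[ |a_{j-1} - x_j| \le \delta(x_j) \quad \text{and} \quad |a_j - x_j| \le \delta(x_j), \]

we have that

\[
\begin{aligned}
|F(a_j) - F(a_{j-1}) - f(x_j)(a_j - a_{j-1})|
&= \big| [F(a_j) - F(x_j) - f(x_j)(a_j - x_j)] \\
&\qquad + [F(x_j) - F(a_{j-1}) + f(x_j)(a_{j-1} - x_j)] \big| \\
&\le |F(a_j) - F(x_j) - f(x_j)(a_j - x_j)| \\
&\qquad + |F(a_{j-1}) - F(x_j) - f(x_j)(a_{j-1} - x_j)| \\
&\le \frac{\varepsilon}{b - a} \big( |a_j - x_j| + |a_{j-1} - x_j| \big) \\
&= \frac{\varepsilon}{b - a} (a_j - x_j + x_j - a_{j-1}) \\
&= \frac{\varepsilon}{b - a} (a_j - a_{j-1}).
\end{aligned}
\]

Hence,

\[
\begin{aligned}
|F(b) - F(a) - S(f, P)|
&= \left| \sum_{j=1}^{m} [F(a_j) - F(a_{j-1})] - \sum_{j=1}^{m} f(x_j)(a_j - a_{j-1}) \right| \\
&= \left| \sum_{j=1}^{m} [F(a_j) - F(a_{j-1}) - f(x_j)(a_j - a_{j-1})] \right| \\
&\le \sum_{j=1}^{m} |F(a_j) - F(a_{j-1}) - f(x_j)(a_j - a_{j-1})| \\
&\le \sum_{j=1}^{m} \frac{\varepsilon}{b - a} (a_j - a_{j-1}) = \varepsilon,
\end{aligned}
\]

and the theorem is proved. ∎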
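For a concrete $F$ the gauge of the proof can be written down explicitly. The Python sketch below takes $F(x) = x^3$ on $[0, 2]$, an example chosen here and not taken from the text: since $F(u) - F(x) - f(x)(u - x) = (u - x)^2 (u + 2x)$ and $|u + 2x| \le 6$ on $[0, 2]$, the constant gauge $\delta = \varepsilon / (6(b - a))$ satisfies the requirement. The code then checks the final inequality on one $\delta$-fine tagged partition with left-endpoint tags.

```python
from fractions import Fraction as F

def riemann_sum(f, tagged_partition):
    return sum(f(x) * (right - left) for x, (left, right) in tagged_partition)

a, b, eps = F(0), F(2), F(1, 10)
F_prim = lambda x: x**3          # the differentiable function F of the theorem
f = lambda x: 3 * x**2           # its derivative

# |F(u) - F(x) - f(x)(u - x)| = |u - x|^2 |u + 2x| <= 6 |u - x|^2 on [0, 2],
# so this constant gauge meets the condition used in the proof.
delta = eps / (6 * (b - a))

m = 240                          # pieces of length (b - a)/m = 1/120 = delta
P = [(a + (b - a) * F(j, m), (a + (b - a) * F(j, m), a + (b - a) * F(j + 1, m)))
     for j in range(m)]

error = abs(F_prim(b) - F_prim(a) - riemann_sum(f, P))
print(error, error <= eps)
```

Since $f$ is continuous here, a constant gauge is enough; the strength of the theorem is that a variable gauge handles derivatives that are far less regular.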




7.6 Primitivable Functions 


In this section and the following one, we denote by $\mathcal{I}$ any interval in $\mathbb{R}$ (not necessarily a compact interval).

A function $f : \mathcal{I} \to \mathbb{R}$ is said to be “primitivable” (or “primitivable on $\mathcal{I}$”) if there is a differentiable function $F : \mathcal{I} \to \mathbb{R}$ such that $F'(x) = f(x)$ for every $x \in \mathcal{I}$. Such a function $F$ is called a “primitive” of $f$.

The Fundamental Theorem establishes that all primitivable functions defined on a compact interval $I = [a, b]$ are integrable and that their integral is easily computable once a primitive is known. It can be reformulated as follows.


Theorem 7.9 (Fundamental Theorem II) Let $f : [a, b] \to \mathbb{R}$ be a primitivable function, and let $F$ be one of its primitives. Then $f$ is integrable on $[a, b]$ and

\[ \int_a^b f = F(b) - F(a). \]

It is sometimes useful to denote the difference $F(b) - F(a)$ by the symbols

\[ [F]_a^b, \qquad [F(x)]_{x=a}^{x=b}, \]

or variants of these, for instance $[F(x)]_a^b$, when no ambiguities can arise.


Example Consider the function $f(x) = x^n$. It is easy to see that $F(x) = \frac{1}{n+1} x^{n+1}$ is a primitive. The Fundamental Theorem tells us that

\[ \int_a^b x^n\, dx = \left[ \frac{1}{n+1} x^{n+1} \right]_a^b = \frac{1}{n+1} \left( b^{n+1} - a^{n+1} \right). \]


The fact that the difference F (b) — F(a) does not depend on the chosen primitive 
is explained by the following proposition. 


Proposition 7.10 Let $f : \mathcal{I} \to \mathbb{R}$ be a primitivable function, and let $F$ be one of its primitives. Then a function $G : \mathcal{I} \to \mathbb{R}$ is a primitive of $f$ if and only if $F - G$ is a constant function on $\mathcal{I}$.

Proof If $F - G$ is constant, then

\[ G'(x) = (F - (F - G))'(x) = F'(x) - (F - G)'(x) = F'(x) = f(x) \]

for every $x \in \mathcal{I}$, and hence $G$ is a primitive of $f$. On the other hand, if $G$ is a primitive of $f$, then we have

\[ (F - G)'(x) = F'(x) - G'(x) = f(x) - f(x) = 0 \]

for every $x \in \mathcal{I}$. Consequently, $F - G$ is constant on $\mathcal{I}$. ∎
Note that if $f : \mathcal{I} \to \mathbb{R}$ is a primitivable function, then it is also primitivable on every subinterval of $\mathcal{I}$. In particular, it is integrable on every interval $[a, x] \subseteq \mathcal{I}$, and therefore it is possible to define a function

\[ x \mapsto \int_a^x f, \]

which we call the “integral function” of $f$ and denote by one of the following symbols:

\[ \int_a^{\,\cdot} f, \qquad \int_a^{\,\cdot} f(t)\, dt. \]

In this last notation it is convenient to use a letter other than $x$ for the variable of $f$; for instance, here we have chosen the letter $t$. The Fundamental Theorem tells us that if $F$ is a primitive of $f$, then

\[ \int_a^x f = F(x) - F(a), \quad \text{for every } x \in [a, b]. \]

We thus see that $\int_a^{\,\cdot} f$ differs from $F$ by a constant, whence the following corollary.


Corollary 7.11 Let $f : [a, b] \to \mathbb{R}$ be a primitivable function. Then the integral function $\int_a^{\,\cdot} f$ is one of its primitives; it is differentiable on $[a, b]$ and

\[ \left( \int_a^{\,\cdot} f \right)'(x) = f(x), \quad \text{for every } x \in [a, b]. \]

Notice that the choice of the point $a$ in the definition of $\int_a^{\,\cdot} f$ is not at all mandatory. If $f : \mathcal{I} \to \mathbb{R}$ is primitivable, one could take any point $\omega \in \mathcal{I}$ and consider the function $\int_\omega^{\,\cdot} f$. The conventions made on the integral with exchanged endpoints are such that the previously stated corollary still holds with this new integral function. Indeed, if $F$ is a primitive of $f$, even if $x < \omega$, then we have

\[ \int_\omega^x f = -\int_x^\omega f = -(F(\omega) - F(x)) = F(x) - F(\omega), \]

so that $\int_\omega^{\,\cdot} f$ is still a primitive of $f$. We can then write

\[ \frac{d}{dx} \int_\omega^x f = f(x), \quad \text{or, equivalently,} \quad \frac{d}{dx} \int_\omega^x f(t)\, dt = f(x). \]




This formula can be generalized; if $\alpha, \beta : [a, b] \to \mathbb{R}$ are two differentiable functions, then

\[ \frac{d}{dx} \int_{\alpha(x)}^{\beta(x)} f(t)\, dt = f(\beta(x))\beta'(x) - f(\alpha(x))\alpha'(x). \]

Indeed, if $F$ is a primitive of $f$, then the preceding formula is easily obtained by writing $\int_{\alpha(x)}^{\beta(x)} f(t)\, dt = F(\beta(x)) - F(\alpha(x))$ and differentiating.
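A numerical check of this differentiation rule is straightforward. The Python sketch below uses $f = \cos$, $\alpha(x) = x$ and $\beta(x) = x^2$, all chosen here purely for illustration, and compares a central difference quotient of the integral with the value given by the formula.

```python
import math

f = math.cos                      # integrand; a primitive is F = sin
alpha = lambda x: x               # lower endpoint alpha(x), with alpha'(x) = 1
beta = lambda x: x**2             # upper endpoint beta(x), with beta'(x) = 2x

def integral(x):
    """Integral of f from alpha(x) to beta(x), computed as F(beta(x)) - F(alpha(x))."""
    return math.sin(beta(x)) - math.sin(alpha(x))

def formula(x):
    """f(beta(x)) beta'(x) - f(alpha(x)) alpha'(x)."""
    return f(beta(x)) * 2 * x - f(alpha(x)) * 1

x, h = 1.3, 1e-5
numeric = (integral(x + h) - integral(x - h)) / (2 * h)   # central difference
print(numeric, formula(x))
```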
We will denote the set of all primitives of $f$ by one of the following symbols:

\[ \int f, \qquad \int f(x)\, dx. \]

One should be careful with the notation $\int f$ introduced for the primitives, which looks similar to that for the integral, even if the two concepts are completely different. Concerning the use of $x$, an observation analogous to the one made for the integral can be made here, as well: it can be replaced by any other letter or symbol, with due precaution. When applying the theory to practical problems, however, if $F$ denotes a primitive of $f$, instead of correctly writing

\[ \int f = \{ F + c : c \in \mathbb{R} \}, \]

it is common to use improper expressions of the type

\[ \int f(x)\, dx = F(x) + c, \]

where $c \in \mathbb{R}$ stands for an arbitrary constant; we will adapt to this habit, too. Let us make a list of primitives of some elementary functions:

\[
\begin{aligned}
\int e^x\, dx &= e^x + c, \\
\int \sin x\, dx &= -\cos x + c, \\
\int \cos x\, dx &= \sin x + c, \\
\int x^{\alpha}\, dx &= \frac{x^{\alpha+1}}{\alpha+1} + c \quad (\text{with } \alpha \ne -1), \\
\int \frac{1}{x}\, dx &= \ln |x| + c, \\
\int \frac{1}{1 + x^2}\, dx &= \arctan x + c, \\
\int \frac{1}{\sqrt{1 - x^2}}\, dx &= \arcsin x + c.
\end{aligned}
\]
Notice that the definition of primitivable function makes sense even in some cases where $f$ is not necessarily defined on an interval, and indeed the preceding formulas should be interpreted on the natural domains of the considered functions. For example,

\[
\int \frac{1}{x}\, dx =
\begin{cases}
\ln x + c & \text{if } x \in\, ]0, +\infty[, \\
\ln(-x) + c & \text{if } x \in\, ]-\infty, 0[.
\end{cases}
\]


Example Using the Fundamental Theorem we find

\[ \int_0^{\pi} \sin x\, dx = [-\cos x]_0^{\pi} = -\cos \pi + \cos 0 = 2. \]


Notice that the presence of the arbitrary constant $c$ can sometimes lead to apparently different results. For example, we know that $\int \frac{1}{\sqrt{1 - x^2}}\, dx = \arcsin x + c$, but it is readily verified that we also have

\[ \int \frac{1}{\sqrt{1 - x^2}}\, dx = -\arccos x + c. \]

This is explained by the fact that $\arcsin x = \frac{\pi}{2} - \arccos x$ for every $x \in [-1, 1]$, hence the difference of $\arcsin$ and $-\arccos$ is constant. The same notation $c$ for the arbitrary constant in the two formulas could sometimes be misleading!


From the known properties of derivatives we can easily prove the following two 
propositions. 


Proposition 7.12 Let $f$ and $g$ be primitivable on $\mathcal{I}$, and let $F$ and $G$ be two corresponding primitives. Then $f + g$ is primitivable on $\mathcal{I}$, and $F + G$ is one of its primitives; we will briefly write¹

\[ \int (f + g) = \int f + \int g. \]

¹ Here and in what follows, we use in an intuitive way the algebraic operations involving sets. To be precise, the sum of two sets $A$ and $B$ is defined as

\[ A + B = \{ a + b : a \in A,\ b \in B \}. \]




Proposition 7.13 Let $f$ be primitivable on $\mathcal{I}$, and let $F$ be one of its primitives. If $\alpha \in \mathbb{R}$ is any given constant, then $\alpha f$ is primitivable on $\mathcal{I}$, and $\alpha F$ is one of its primitives; we will briefly write

\[ \int (\alpha f) = \alpha \int f. \]

As a consequence of these propositions, we have that the set of primitivable functions on $\mathcal{I}$ is a real vector space.

We conclude this section by presenting an interesting class of integrable functions that are not primitivable. Let the function $f : [a, b] \to \mathbb{R}$ be such that the set

\[ E = \{ x \in [a, b] : f(x) \ne 0 \} \]

is finite or countable (for instance, a function that is zero everywhere except at a point, or the Dirichlet function $D : [a, b] \to \mathbb{R}$, defined by $D(x) = 1$ if $x$ is rational, and $D(x) = 0$ if $x$ is irrational).


Let us prove that such a function is integrable, with $\int_a^b f = 0$. Assume for definiteness that $E$ is infinite (the case where $E$ is finite can be treated in an analogous way). Since it is countable, we can write $E = \{ e_n : n \in \mathbb{N} \}$. Once $\varepsilon > 0$ has been fixed, we construct a gauge $\delta$ on $[a, b]$ as follows. If $x \notin E$, then we set $\delta(x) = 1$; if, instead, for a certain $n$ we have $x = e_n$, then we set

\[ \delta(e_n) = \frac{\varepsilon}{2^{n+3} |f(e_n)|}. \]

Now let $P = \{ (x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m]) \}$ be a $\delta$-fine tagged partition of $[a, b]$. By the way in which $f$ is defined, the associated Riemann sum becomes

\[ S(f, P) = \sum_{\{1 \le j \le m \,:\, x_j \in E\}} f(x_j)(a_j - a_{j-1}). \]

Let $N$ be the largest index $n$ for which $e_n$ coincides with one of the points $x_j$. Since $[a_{j-1}, a_j] \subseteq [x_j - \delta(x_j), x_j + \delta(x_j)]$, we have that $a_j - a_{j-1} \le 2\delta(x_j)$, and if $x_j$ is in $E$, it must be that $x_j = e_n$ for some $n \in \mathbb{N}$. To any such $e_n$ can, however, correspond one or two points $x_j$, so that we will have

\[
\left| \sum_{\{1 \le j \le m \,:\, x_j \in E\}} f(x_j)(a_j - a_{j-1}) \right|
\le 2 \sum_{n=0}^{N} |f(e_n)|\, 2\delta(e_n)
= 4 \sum_{n=0}^{N} \frac{\varepsilon}{2^{n+3}}
= \frac{\varepsilon}{2} \sum_{n=0}^{N} \frac{1}{2^n} < \varepsilon.
\]

This shows that $f$ is integrable on $[a, b]$ and that $\int_a^b f = 0$.

Let us see now that if $E$ is nonempty, then $f$ is not primitivable on $[a, b]$. Indeed, if it were, its integral function $\int_a^{\,\cdot} f$ should be one of its primitives. But the foregoing procedure shows that $\int_a^x f = 0$ for every $x \in [a, b]$. Then $f$ should be identically zero, being the derivative of a constant function, a contradiction.


7.7 Primitivation by Parts and by Substitution


We now present two methods frequently used for finding the primitives of certain 
functions. The first one is known as the method of “primitivation by parts.” 


Proposition 7.14 Let $F, G : \mathcal{I} \to \mathbb{R}$ be two differentiable functions, and let $f, g$ be the corresponding derivatives. One has that $fG$ is primitivable on $\mathcal{I}$ if and only if $Fg$ is, in which case a primitive of $fG$ is obtained subtracting from $FG$ a primitive of $Fg$; we will briefly write

\[ \int fG = FG - \int Fg. \]

Proof Since $F$ and $G$ are differentiable, then so is $FG$, and we have

\[ (FG)' = fG + Fg. \]

Hence, $fG + Fg$ is primitivable on $\mathcal{I}$ with primitive $FG$, and the conclusion follows from Proposition 7.12. ∎


Example We would like to find a primitive of the function $h(x) = xe^x$. Define the following functions: $f(x) = e^x$, $G(x) = x$, and consequently $F(x) = e^x$, $g(x) = 1$. Applying the formula given by the foregoing proposition, we have

\[ \int e^x x\, dx = e^x x - \int e^x\, dx = xe^x - e^x + c, \]

where $c$ stands, as usual, for an arbitrary constant.


As an immediate consequence of Proposition 7.14, we have the rule of “integration by parts”:

\[ \int_a^b fG = F(b)G(b) - F(a)G(a) - \int_a^b Fg. \]

Examples Applying the formula to the function $h(x) = xe^x$ of the previous example, we compute

\[ \int_0^1 e^x x\, dx = e^1 \cdot 1 - e^0 \cdot 0 - \int_0^1 e^x\, dx = e - [e^x]_0^1 = e - (e^1 - e^0) = 1. \]

Note that we could have obtained the same result using the Fundamental Theorem; having already found earlier that a primitive of $h$ is given by $H(x) = xe^x - e^x$, we have that

\[ \int_0^1 e^x x\, dx = H(1) - H(0) = (e - e) - (0 - 1) = 1. \]


Let us consider some additional examples. Let $h(x) = \sin^2 x$. With the obvious choice of the functions $f$ and $G$, we find

\[
\begin{aligned}
\int \sin^2 x\, dx &= -\cos x \sin x + \int \cos^2 x\, dx \\
&= -\cos x \sin x + \int (1 - \sin^2 x)\, dx \\
&= x - \cos x \sin x - \int \sin^2 x\, dx,
\end{aligned}
\]

from which we obtain

\[ \int \sin^2 x\, dx = \frac{1}{2}(x - \cos x \sin x) + c. \]

Consider now the case of the function $h(x) = \ln x$, with $x > 0$. To apply the formula of primitivation by parts, we choose the functions $f(x) = 1$, $G(x) = \ln x$. In this way, we find

\[ \int \ln x\, dx = x \ln x - \int x \frac{1}{x}\, dx = x \ln x - \int 1\, dx = x \ln x - x + c. \]
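Primitives found by parts are easy to double-check by differentiating them numerically. The following Python sketch does this for the three primitives obtained above, using a central difference quotient; the helper name `derivative`, the step size, and the sample points are assumptions of this illustration.

```python
import math

def derivative(F, x, h=1e-5):
    """Central-difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Each pair is (candidate primitive, function it should differentiate to).
pairs = [
    (lambda x: x * math.exp(x) - math.exp(x), lambda x: x * math.exp(x)),
    (lambda x: (x - math.cos(x) * math.sin(x)) / 2, lambda x: math.sin(x) ** 2),
    (lambda x: x * math.log(x) - x, lambda x: math.log(x)),
]

for primitive, target in pairs:
    for x in (0.5, 1.0, 2.0):
        print(abs(derivative(primitive, x) - target(x)))
```

The printed differences are of the order of the discretization and rounding errors, which supports (without proving) the three formulas.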


The second method we want to study is known as the method of “primitivation 
by substitution.” 


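Proposition 7.15 Let $\varphi : \mathcal{I} \to \mathbb{R}$ be a differentiable function and $f : \varphi(\mathcal{I}) \to \mathbb{R}$ be a primitivable function on the interval $\varphi(\mathcal{I})$, with primitive $F$. Then the function $(f \circ \varphi)\varphi'$ is primitivable on $\mathcal{I}$, and one of its primitives is given by $F \circ \varphi$. We will briefly write

\[ \int (f \circ \varphi)\varphi' = \left( \int f \right) \circ \varphi. \]

Proof The function $F \circ \varphi$ is differentiable on $\mathcal{I}$ and

\[ (F \circ \varphi)' = (F' \circ \varphi)\varphi' = (f \circ \varphi)\varphi'. \]

It follows that $(f \circ \varphi)\varphi'$ is primitivable on $\mathcal{I}$, with primitive $F \circ \varphi$. ∎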


As an example, we look for a primitive of the function $h(x) = xe^{x^2}$. Defining $\varphi(x) = x^2$, $f(t) = \frac{1}{2} e^t$ (it is advisable to use different letters to indicate the variables of $\varphi$ and $f$), we have that $h = (f \circ \varphi)\varphi'$. Since a primitive of $f$ is seen to be $F(t) = \frac{1}{2} e^t$, a primitive of $h$ is $F \circ \varphi$, i.e.,

\[ \int xe^{x^2}\, dx = F(\varphi(x)) + c = \frac{1}{2} e^{x^2} + c. \]

The formula of primitivation by substitution is often written in the form

\[ \int f(\varphi(x))\varphi'(x)\, dx = \left. \int f(t)\, dt \,\right|_{t = \varphi(x)}, \]

where, if $F$ is a primitive of $f$, the term on the right-hand side should be read

\[ \left. \int f(t)\, dt \,\right|_{t = \varphi(x)} = \Big. F(t) + c \,\Big|_{t = \varphi(x)} = F(\varphi(x)) + c. \]

Formally, there is a “change of variable” $t = \varphi(x)$, and the symbol $dt$ joins the game to replace $\varphi'(x)\, dx$ (the Leibniz notation $\frac{dt}{dx} = \varphi'(x)$ is a useful mnemonic rule).
t=9(x) 


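Example To find a primitive of the function $h(x) = \frac{\ln x}{x}$, we can choose $\varphi(x) = \ln x$ and apply the formula

\[ \int \frac{\ln x}{x}\, dx = \left. \int t\, dt \,\right|_{t = \ln x} = \left. \frac{1}{2} t^2 + c \,\right|_{t = \ln x} = \frac{1}{2} (\ln x)^2 + c. \]

In this case, writing $t = \ln x$, we have that the symbol $dt$ replaces $\frac{1}{x}\, dx$.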


As a consequence of the preceding formulas, we have the rule of “integration by substitution”:

\[ \int_a^b f(\varphi(x))\varphi'(x)\, dx = \int_{\varphi(a)}^{\varphi(b)} f(t)\, dt. \]

Indeed, if $F$ is a primitive of $f$ on $\varphi(\mathcal{I})$, applying the Fundamental Theorem twice, we have

\[ \int_a^b (f \circ \varphi)\varphi' = (F \circ \varphi)(b) - (F \circ \varphi)(a) = F(\varphi(b)) - F(\varphi(a)) = \int_{\varphi(a)}^{\varphi(b)} f. \]

Example Taking the function $h(x) = xe^{x^2}$ defined previously, we have

\[ \int_0^2 xe^{x^2}\, dx = \int_0^4 \frac{1}{2} e^t\, dt = \frac{1}{2} [e^t]_0^4 = \frac{e^4 - 1}{2}. \]

Clearly, the same result is obtainable directly by the Fundamental Theorem once we know that a primitive of $h$ is given by $H(x) = \frac{1}{2} e^{x^2}$. Indeed, we have

\[ \int_0^2 xe^{x^2}\, dx = H(2) - H(0) = \frac{1}{2} e^4 - \frac{1}{2} e^0 = \frac{e^4 - 1}{2}. \]
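Both sides of the substitution rule can also be approximated numerically. The Python sketch below uses the composite Simpson rule, a standard quadrature scheme that is not part of the text, to approximate the integral in $x$ over $[0, 2]$ and the integral in $t$ over $[0, 4]$, and compares them with $(e^4 - 1)/2$; the function name `simpson` and the number of subintervals are illustrative choices.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    total += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return total * h / 3

left = simpson(lambda x: x * math.exp(x**2), 0.0, 2.0)   # integral in x over [0, 2]
right = simpson(lambda t: 0.5 * math.exp(t), 0.0, 4.0)   # integral in t over [0, 4]
exact = (math.exp(4) - 1) / 2
print(left, right, exact)
```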


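When the function $\varphi : \mathcal{I} \to \varphi(\mathcal{I})$ is invertible, we can also write

\[ \int f(t)\, dt = \left. \int f(\varphi(x))\varphi'(x)\, dx \,\right|_{x = \varphi^{-1}(t)}, \]

with the corresponding formula for the integral:

\[ \int_{\alpha}^{\beta} f(t)\, dt = \int_{\varphi^{-1}(\alpha)}^{\varphi^{-1}(\beta)} f(\varphi(x))\varphi'(x)\, dx. \]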


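Example Looking for a primitive of $f(t) = \sqrt{1 - t^2}$, with $t \in\, ]-1, 1[$, we may consider the function $\varphi : ]0, \pi[\, \to\, ]-1, 1[$ defined as $\varphi(x) = \cos x$, so that

\[ f(\varphi(x))\varphi'(x) = \sqrt{1 - \cos^2 x}\, (-\sin x) = -\sin^2 x, \]

since $\sin x > 0$ when $x \in\, ]0, \pi[$. Therefore, we can write

\[
\begin{aligned}
\int \sqrt{1 - t^2}\, dt &= \left. -\int \sin^2 x\, dx \,\right|_{x = \arccos t} \\
&= \left. -\frac{1}{2}(x - \cos x \sin x) + c \,\right|_{x = \arccos t} \\
&= -\frac{1}{2} \left( \arccos t - t \sqrt{1 - t^2} \right) + c.
\end{aligned}
\]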




7.8 The Taylor Formula with Integral Form Remainder 
Here we have the “Taylor formula with integral form of the remainder.” 


Theorem 7.16 (Taylor Theorem II) Let $x \ne x_0$ be two points of an interval $I$ and $f : I \to \mathbb{R}$ be $n + 1$ times differentiable. Then

\[ f(x) = p_n(x) + \frac{1}{n!} \int_{x_0}^{x} f^{(n+1)}(u)(x - u)^n\, du, \]

where $p_n(x)$ is the $n$th-order Taylor polynomial associated with $f$ at $x_0$.


Proof Let us first prove by induction that if $f$ is $n + 1$ times differentiable, then the function $g_n(u) = f^{(n+1)}(u)(x - u)^n$ is primitivable (here $x$ is fixed). If $n = 0$, then we have that $g_0(u) = f'(u)$, hence the proposition is true. Assume now that the proposition is true for some $n \in \mathbb{N}$. Then, if $f$ is $n + 2$ times differentiable,

\[ D_u \left( f^{(n+1)}(u)(x - u)^{n+1} \right) = f^{(n+2)}(u)(x - u)^{n+1} - (n + 1) f^{(n+1)}(u)(x - u)^n, \]

i.e.,

\[ g_{n+1}(u) = (n + 1) g_n(u) + D_u \left( f^{(n+1)}(u)(x - u)^{n+1} \right). \]

Since we know that $g_n$ is primitivable, the preceding formula tells us that $g_{n+1}$ is too, since it is the sum of two primitivable functions. We have thus proved the assertion.

Let us now prove the formula by induction. If $n = 0$, then, by the Fundamental Theorem,

\[ f(x) = f(x_0) + \int_{x_0}^{x} f'(u)\, du = p_0(x) + \frac{1}{0!} \int_{x_0}^{x} f^{(1)}(u)(x - u)^0\, du, \]

hence the formula is true. Assume now that the formula holds true for some $n \in \mathbb{N}$, and let $f$ be $n + 2$ times differentiable. Then

\[
\begin{aligned}
f(x) - p_{n+1}(x) &= f(x) - \left( p_n(x) + \frac{1}{(n+1)!} f^{(n+1)}(x_0)(x - x_0)^{n+1} \right) \\
&= \frac{1}{n!} \int_{x_0}^{x} f^{(n+1)}(u)(x - u)^n\, du - \frac{1}{(n+1)!} f^{(n+1)}(x_0)(x - x_0)^{n+1} \\
&= \frac{1}{n!} \left( \int_{x_0}^{x} f^{(n+1)}(u)(x - u)^n\, du - \frac{1}{n+1} f^{(n+1)}(x_0)(x - x_0)^{n+1} \right).
\end{aligned}
\]

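Integrating by parts (we know that $g_n$ and $g_{n+1}$ are primitivable),

\[
\begin{aligned}
\int_{x_0}^{x} f^{(n+1)}(u)(x - u)^n\, du
&= \left[ -f^{(n+1)}(u) \frac{(x - u)^{n+1}}{n+1} \right]_{u = x_0}^{u = x} - \int_{x_0}^{x} \left( -\frac{(x - u)^{n+1}}{n+1} \right) f^{(n+2)}(u)\, du \\
&= \frac{1}{n+1} f^{(n+1)}(x_0)(x - x_0)^{n+1} + \frac{1}{n+1} \int_{x_0}^{x} f^{(n+2)}(u)(x - u)^{n+1}\, du,
\end{aligned}
\]

and substituting,

\[
\begin{aligned}
f(x) - p_{n+1}(x) &= \frac{1}{n!} \left( \frac{1}{n+1} \int_{x_0}^{x} f^{(n+2)}(u)(x - u)^{n+1}\, du \right) \\
&= \frac{1}{(n+1)!} \int_{x_0}^{x} f^{(n+2)}(u)(x - u)^{n+1}\, du.
\end{aligned}
\]

Hence, the formula holds also for $n + 1$, and the proof is complete. ∎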
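The formula with integral remainder can be verified numerically in a simple case. The Python sketch below takes $f = \exp$, $x_0 = 0$, $x = 1$ and $n = 3$, all illustrative choices, and approximates the integral with the composite Simpson rule, a standard quadrature scheme not taken from the text.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    total += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return total * h / 3

x0, x, n = 0.0, 1.0, 3
# For f = exp every derivative is exp, so p_n(x) = sum of e^{x0} (x - x0)^k / k!.
p_n = sum(math.exp(x0) * (x - x0) ** k / math.factorial(k) for k in range(n + 1))
remainder = simpson(lambda u: math.exp(u) * (x - u) ** n, x0, x) / math.factorial(n)
print(math.exp(x) - p_n, remainder)
```

The two printed numbers agree up to the quadrature and rounding errors, as the theorem predicts.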


7.9 The Cauchy Criterion 


We have already encountered the Cauchy criterion for sequences in complete metric 
spaces and for the limit of functions (Theorem 4.15). It is not surprising that a 
similar criterion holds also for integrability, which can be thought as a kind of 
“limit” of the Riemann sums. 


Theorem 7.17 (Cauchy Criterion) A function $f : I \to \mathbb{R}$ is integrable if and only if for every $\varepsilon > 0$ there is a gauge $\delta : I \to \mathbb{R}$ such that, taking two $\delta$-fine tagged partitions $P$, $Q$ of $I$, we have

\[ |S(f, P) - S(f, Q)| \le \varepsilon. \]


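Proof Let us first prove the necessary condition. Let $f$ be integrable on $I$, with integral $\mathcal{J}$, and fix $\varepsilon > 0$. Then there is a gauge $\delta$ on $I$ such that, for every $\delta$-fine tagged partition $P$ of $I$, we have

\[ |S(f, P) - \mathcal{J}| \le \frac{\varepsilon}{2}. \]

If $P$ and $Q$ are two $\delta$-fine tagged partitions, then we have

\[ |S(f, P) - S(f, Q)| \le |S(f, P) - \mathcal{J}| + |\mathcal{J} - S(f, Q)| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]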




Let us now prove the sufficiency. Once the stated condition is assumed, let us choose
$\varepsilon = 1$ so that we can find a gauge $\delta_1$ on $I$ such that

\[ |S(f,P) - S(f,Q)| \le 1 \]

whenever $P$ and $Q$ are $\delta_1$-fine tagged partitions of $I$. Taking $\varepsilon = 1/2$, we can find a
gauge $\delta_2$ on $I$ that we can choose so that $\delta_2(x) \le \delta_1(x)$ for every $x \in I$, such that

\[ |S(f,P) - S(f,Q)| \le \frac{1}{2} \]

whenever $P$ and $Q$ are $\delta_2$-fine tagged partitions of $I$. We can continue this way,
choosing $\varepsilon = 1/k$, with $k$ a positive integer, and find a sequence $(\delta_k)_k$ of gauges on
$I$ such that, for every $x \in I$,

\[ \delta_1(x) \ge \delta_2(x) \ge \dots \ge \delta_k(x) \ge \delta_{k+1}(x) \ge \dots, \]

and such that

\[ |S(f,P) - S(f,Q)| \le \frac{1}{k} \]

whenever $P$ and $Q$ are $\delta_k$-fine tagged partitions of $I$.

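Let us fix, for every $k$, a $\delta_k$-fine tagged partition $P_k$ of $I$. We want to show that
$(S(f,P_k))_k$ is a Cauchy sequence of real numbers. Let $\varepsilon > 0$ be given. Let us
choose a positive integer $N$ such that $N\varepsilon \ge 1$. If $k_1 \ge N$ and $k_2 \ge N$, assuming,
for instance, $k_2 \ge k_1$, then $P_{k_2}$ is also $\delta_{k_1}$-fine (because $\delta_{k_2} \le \delta_{k_1}$), and we have

\[ |S(f,P_{k_1}) - S(f,P_{k_2})| \le \frac{1}{k_1} \le \frac{1}{N} \le \varepsilon. \]

This proves that $(S(f,P_k))_k$ is a Cauchy sequence; hence, it has a finite limit, which
we denote by $\mathcal{J}$.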

Now we show that $\mathcal{J}$ is just the integral of $f$ on $I$. Fix $\varepsilon > 0$, let $n$ be a positive
integer such that $n\varepsilon \ge 1$, and consider the gauge $\delta = \delta_n$. For every $\delta$-fine tagged
partition $P$ of $I$ and for every $k \ge n$, we have

\[ |S(f,P) - S(f,P_k)| \le \frac{1}{n} \le \varepsilon. \]

Letting $k$ tend to $+\infty$, we have that $S(f,P_k)$ tends to $\mathcal{J}$, and consequently

\[ |S(f,P) - \mathcal{J}| \le \varepsilon. \]

The proof is thus completed. □




7.10 Integrability on Subintervals 


In this section we will see that if a function is integrable on an interval $I = [a, b]$,
then it is also integrable on any of its subintervals. In particular, it is possible to
consider its integral function. Moreover, we will see that if a function is integrable
on two contiguous intervals, then it is also integrable on their union.

In what follows, it will be useful to consider the so-called “tagged subpartitions”
of the interval $I$. A tagged subpartition is a set of the type

\[ \Xi = \{(\xi_j, [\alpha_j, \beta_j]) : j = 1, \dots, m\}, \]

where the intervals $[\alpha_j, \beta_j]$ are nonoverlapping, but not necessarily contiguous,
and $\xi_j \in [\alpha_j, \beta_j]$ for every $j = 1, \dots, m$. For a tagged subpartition $\Xi$, it is still
meaningful to consider the associated Riemann sum

\[ S(f, \Xi) = \sum_{j=1}^{m} f(\xi_j)(\beta_j - \alpha_j). \]

Moreover, given a gauge $\delta$ on $I$, the tagged subpartition $\Xi$ is $\delta$-fine if, for every $j$,
we have

\[ \xi_j - \alpha_j \le \delta(\xi_j) \quad\text{and}\quad \beta_j - \xi_j \le \delta(\xi_j). \]

Let us state the property of “additivity on subintervals.”


Theorem 7.18 Given three points $a < c < b$, the function $f : [a,b] \to \mathbb{R}$ is
integrable on $[a, b]$ if and only if it is integrable on both $[a,c]$ and $[c, b]$. In this
case,

\[ \int_a^b f = \int_a^c f + \int_c^b f. \]


Proof We denote by $f_1 : [a,c] \to \mathbb{R}$ and $f_2 : [c, b] \to \mathbb{R}$ the two restrictions of $f$
to $[a, c]$ and $[c, b]$, respectively.

Let us first assume that $f$ is integrable on $[a, b]$ and prove that $f_1$ is integrable
on $[a, c]$ by the Cauchy criterion. Fix $\varepsilon > 0$; since $f$ is integrable on $[a, b]$, it
satisfies the Cauchy condition, and hence there is a gauge $\delta : [a, b] \to \mathbb{R}$ such that

\[ |S(f,P) - S(f,Q)| \le \varepsilon \]

for every two $\delta$-fine tagged partitions $P$, $Q$ of $[a, b]$. The restrictions of $\delta$ to $[a, c]$
and $[c, b]$ are two gauges $\delta_1 : [a,c] \to \mathbb{R}$ and $\delta_2 : [c, b] \to \mathbb{R}$. Now let $P_1$ and $Q_1$
be two $\delta_1$-fine tagged partitions of $[a, c]$. Let us fix a $\delta_2$-fine tagged partition $P_2$ of
$[c, b]$ and consider the tagged partition $P$ of $[a, b]$ made by $P_1 \cup P_2$ and the tagged
partition $Q$ of $[a, b]$ made by $Q_1 \cup P_2$. It is clear that both $P$ and $Q$ are $\delta$-fine.
Moreover, we have

\[ |S(f_1, P_1) - S(f_1, Q_1)| = |S(f,P) - S(f,Q)| \le \varepsilon; \]

the Cauchy criterion thus applies, so that $f$ is integrable on $[a, c]$. Analogously it
can be proved that $f$ is integrable on $[c, b]$.

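Suppose now that $f_1$ is integrable on $[a,c]$ and $f_2$ on $[c, b]$. Let us then prove
that $f$ is integrable on $[a, b]$ with integral $\int_a^c f + \int_c^b f$. Once $\varepsilon > 0$ is fixed, there
is a gauge $\delta_1$ on $[a, c]$ and a gauge $\delta_2$ on $[c, b]$ such that, for every $\delta_1$-fine tagged
partition $P_1$ of $[a, c]$, we have

\[ \Big|S(f_1, P_1) - \int_a^c f\Big| \le \frac{\varepsilon}{2}, \]

and for every $\delta_2$-fine tagged partition $P_2$ of $[c, b]$, we have

\[ \Big|S(f_2, P_2) - \int_c^b f\Big| \le \frac{\varepsilon}{2}. \]

We now define a gauge $\delta$ on $[a, b]$ as follows:

\[
\delta(x) = \begin{cases}
\min\big\{\delta_1(x), \frac{c-x}{2}\big\} & \text{if } a \le x < c, \\
\min\{\delta_1(c), \delta_2(c)\} & \text{if } x = c, \\
\min\big\{\delta_2(x), \frac{x-c}{2}\big\} & \text{if } c < x \le b.
\end{cases}
\]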


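Let

\[ P = \{(x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m])\} \]

be a $\delta$-fine tagged partition of $[a, b]$. Notice that, because of the particular choice of
the gauge $\delta$, there must be a certain $\bar{j}$ for which $x_{\bar{j}} = c$. Hence, we have

\[
\begin{aligned}
S(f,P) = {} & f(x_1)(a_1 - a_0) + \dots + f(x_{\bar{j}-1})(a_{\bar{j}-1} - a_{\bar{j}-2}) \\
& + f(c)(c - a_{\bar{j}-1}) + f(c)(a_{\bar{j}} - c) \\
& + f(x_{\bar{j}+1})(a_{\bar{j}+1} - a_{\bar{j}}) + \dots + f(x_m)(a_m - a_{m-1}).
\end{aligned}
\]

Let us set

\[ P_1 = \{(x_1, [a_0, a_1]), \dots, (x_{\bar{j}-1}, [a_{\bar{j}-2}, a_{\bar{j}-1}]), (c, [a_{\bar{j}-1}, c])\} \]

and

\[ P_2 = \{(c, [c, a_{\bar{j}}]), (x_{\bar{j}+1}, [a_{\bar{j}}, a_{\bar{j}+1}]), \dots, (x_m, [a_{m-1}, a_m])\} \]

(but in case $a_{\bar{j}-1}$ or $a_{\bar{j}}$ coincides with $c$, we will have to take away an element from
one of the two). Then $P_1$ is a $\delta_1$-fine tagged partition of $[a, c]$ and $P_2$ is a $\delta_2$-fine
tagged partition of $[c, b]$, and we have

\[ S(f,P) = S(f_1, P_1) + S(f_2, P_2). \]

Consequently,

\[
\Big|S(f,P) - \Big(\int_a^c f + \int_c^b f\Big)\Big| \le \Big|S(f_1,P_1) - \int_a^c f\Big| + \Big|S(f_2,P_2) - \int_c^b f\Big| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\]

which completes the proof. □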


Example Consider the function $f : [0,2] \to \mathbb{R}$ defined by

\[
f(x) = \begin{cases}
3 & \text{if } x \in [0,1], \\
5 & \text{if } x \in \,]1,2].
\end{cases}
\]

Since $f$ is constant on $[0, 1]$ with value 3, it is integrable there, and $\int_0^1 f = 3$.
Moreover, on the interval $[1, 2]$ the function $f$ differs from the constant 5 at only
one point: the function $g(x) = f(x) - 5$ is zero except for $x = 1$.
As we have shown at the end of Sect. 7.6, $g$ is integrable on $[1, 2]$ with zero integral,
and so, since $f(x) = g(x) + 5$, $f$ is integrable as well and

\[ \int_1^2 f(x)\,dx = \int_1^2 g(x)\,dx + \int_1^2 5\,dx = 0 + 5 = 5. \]

In conclusion,

\[ \int_0^2 f(x)\,dx = \int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx = 3 + 5 = 8. \]
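
The value 8 obtained above can be observed numerically. The following sketch, in which the helper `riemann_sum` and the use of equally spaced subintervals are illustrative assumptions, evaluates Riemann sums of the step function with different choices of tags; the single jump at $x = 1$ influences at most a few terms, so its effect vanishes as the subintervals shrink.

```python
def f(x):
    """The step function of the example: 3 on [0, 1], 5 on ]1, 2]."""
    return 3.0 if x <= 1.0 else 5.0

def riemann_sum(func, a, b, m, tag="mid"):
    """Riemann sum of func over m equal subintervals of [a, b].

    tag selects the tagged point in each subinterval: 'left', 'mid' or 'right'.
    """
    h = (b - a) / m
    offset = {"left": 0.0, "mid": 0.5, "right": 1.0}[tag]
    return sum(func(a + (j + offset) * h) for j in range(m)) * h

for m in (10, 100, 1000, 10000):
    print(m, riemann_sum(f, 0.0, 2.0, m, "left"), riemann_sum(f, 0.0, 2.0, m, "right"))
```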


It is easy to see from the theorem just proved that if a function is integrable on
an interval $I$, it is still integrable on any subinterval of $I$. Moreover, we have the
following corollary.


Corollary 7.19 If $f : I \to \mathbb{R}$ is integrable, for any three arbitrarily chosen points
$u, v, w$ in $I$ one has

\[ \int_u^w f = \int_u^v f + \int_v^w f. \]


Proof The case $u < v < w$ follows immediately from the previous theorem. The
other possible cases are easily obtained using the conventions on the integrals with
exchanged or equal endpoints. □


7.11 R-Integrable and Continuous Functions 
Let us introduce an important class of integrable functions. As usual, $I = [a, b]$.


Definition 7.20 We say that an integrable function $f : I \to \mathbb{R}$ is “R-integrable”
(or “integrable according to Riemann”) if among all possible gauges $\delta : I \to \mathbb{R}$
that verify the definition of integrability it is always possible to choose one that is
constant on $I$.


We can immediately see, repeating the proofs, that the set of R-integrable 
functions is a vector subspace of the space of integrable functions. Moreover, the 
following Cauchy criterion holds for R-integrable functions whenever one considers 
only constant gauges. 


Theorem 7.21 A function $f : I \to \mathbb{R}$ is R-integrable if and only if for every
$\varepsilon > 0$ there is a $\delta > 0$ (i.e., a constant gauge $\delta$) such that, taking two $\delta$-fine tagged
partitions $P$, $Q$ of $I$, one has

\[ |S(f,P) - S(f,Q)| \le \varepsilon. \]

We now want to establish the integrability of continuous functions. Indeed, in 
the following two theorems we will prove that they are both R-integrable and 
primitivable. 


To simplify the expressions to come, we will denote by $\mu(K)$ the length of a
bounded interval $K$. In particular,

\[ \mu([a, b]) = b - a. \]

It will be useful, moreover, to set $\mu(\emptyset) = 0$. Here is the first theorem.

Theorem 7.22 Every continuous function $f : I \to \mathbb{R}$ is R-integrable.

Proof Fix $\varepsilon > 0$. Since $f$ is continuous on a compact interval, by Heine’s
Theorem 4.12, it is uniformly continuous there, so that there is a $\delta > 0$ such that,
for $x$ and $x'$ in $I$,

\[ |x - x'| \le 2\delta \;\Longrightarrow\; |f(x) - f(x')| \le \frac{\varepsilon}{b-a}. \]


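We will verify the Cauchy criterion for the R-integrability by considering the
constant gauge $\delta$ thus found. Let

\[ P = \{(x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m])\} \]

and

\[ Q = \{(\tilde{x}_1, [\tilde{a}_0, \tilde{a}_1]), \dots, (\tilde{x}_{\tilde{m}}, [\tilde{a}_{\tilde{m}-1}, \tilde{a}_{\tilde{m}}])\} \]

be two $\delta$-fine tagged partitions of $I$. Let us define the intervals (perhaps empty or
reduced to a single point)

\[ I_{j,k} = [a_{j-1}, a_j] \cap [\tilde{a}_{k-1}, \tilde{a}_k]. \]

Then we have

\[ a_j - a_{j-1} = \sum_{k=1}^{\tilde{m}} \mu(I_{j,k}), \qquad \tilde{a}_k - \tilde{a}_{k-1} = \sum_{j=1}^{m} \mu(I_{j,k}), \]

and, if $I_{j,k}$ is nonempty, $|x_j - \tilde{x}_k| \le 2\delta$. Hence,

\[
\begin{aligned}
|S(f,P) - S(f,Q)| &= \Big|\sum_{j=1}^{m}\sum_{k=1}^{\tilde{m}} f(x_j)\,\mu(I_{j,k}) - \sum_{k=1}^{\tilde{m}}\sum_{j=1}^{m} f(\tilde{x}_k)\,\mu(I_{j,k})\Big| \\
&= \Big|\sum_{j=1}^{m}\sum_{k=1}^{\tilde{m}} [f(x_j) - f(\tilde{x}_k)]\,\mu(I_{j,k})\Big| \\
&\le \sum_{j=1}^{m}\sum_{k=1}^{\tilde{m}} |f(x_j) - f(\tilde{x}_k)|\,\mu(I_{j,k}) \\
&\le \sum_{j=1}^{m}\sum_{k=1}^{\tilde{m}} \frac{\varepsilon}{b-a}\,\mu(I_{j,k}) = \varepsilon.
\end{aligned}
\]

Therefore, the Cauchy criterion applies, and the proof is completed. □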
Here is the second theorem.

Theorem 7.23 Every continuous function $f : [a,b] \to \mathbb{R}$ is primitivable.

Proof Since it is continuous, $f$ is integrable on every subinterval of $[a, b]$, so we
can consider its integral function $\int_a^{\cdot} f$. Let us prove that it is a primitive of $f$, i.e.,
that if a point $x_0$ is taken in $[a, b]$, the derivative of $\int_a^{\cdot} f$ at $x_0$ coincides with $f(x_0)$.
We first consider the case where $x_0 \in \,]a, b[$. We want to prove that

\[ \lim_{h \to 0} \frac{1}{h}\Big(\int_a^{x_0+h} f - \int_a^{x_0} f\Big) = f(x_0). \]




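Equivalently, since

\[ \frac{1}{h}\Big(\int_a^{x_0+h} f - \int_a^{x_0} f\Big) - f(x_0) = \frac{1}{h}\int_{x_0}^{x_0+h} (f(x) - f(x_0))\,dx, \]

we will show that

\[ \lim_{h \to 0} \frac{1}{h}\int_{x_0}^{x_0+h} (f(x) - f(x_0))\,dx = 0. \]

Fix $\varepsilon > 0$. Since $f$ is continuous at $x_0$, there is a $\delta > 0$ such that, for every
$x \in [a, b]$ satisfying $|x - x_0| \le \delta$, one has $|f(x) - f(x_0)| \le \varepsilon$. Taking $h$ such that
$0 < |h| \le \delta$, we distinguish two cases. If $0 < h \le \delta$, then

\[ \Big|\frac{1}{h}\int_{x_0}^{x_0+h} (f(x) - f(x_0))\,dx\Big| \le \frac{1}{h}\int_{x_0}^{x_0+h} |f(x) - f(x_0)|\,dx \le \frac{1}{h}\int_{x_0}^{x_0+h} \varepsilon\,dx = \varepsilon; \]

on the other hand, if $-\delta \le h < 0$, then we have

\[ \Big|\frac{1}{h}\int_{x_0}^{x_0+h} (f(x) - f(x_0))\,dx\Big| \le \frac{1}{-h}\int_{x_0+h}^{x_0} |f(x) - f(x_0)|\,dx \le \frac{1}{-h}\int_{x_0+h}^{x_0} \varepsilon\,dx = \varepsilon, \]

and the proof is completed when $x_0 \in \,]a, b[$. In case $x_0 = a$ or $x_0 = b$, we proceed
analogously, considering the right or the left derivative, respectively. □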


Notice that it is not always possible to find an elementary expression for the
primitive of a continuous function. As an example, the function $f(x) = \sin(x^2)$,
being continuous, is primitivable, but there is no elementary formula defining any
of its primitives. By “elementary formula” we mean an analytic formula where only
polynomials, exponentials, logarithms, and trigonometric functions appear.

Let us now prove that the Dirichlet function $D$ is not R-integrable on any interval
$[a, b]$ (remember that $D(x)$ is 1 on the rationals and 0 on the irrationals). We will
show that the Cauchy criterion is not verified. Take $\delta > 0$ constant, and let
$a = a_0 < a_1 < \dots < a_m = b$ be such that, for every $j = 1, \dots, m$, one has $a_j - a_{j-1} \le \delta$.
In every interval $[a_{j-1}, a_j]$ we can choose a rational number $x_j$ and an irrational
number $\tilde{x}_j$. The two tagged partitions

\[ P = \{(x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m])\}, \]

\[ Q = \{(\tilde{x}_1, [a_0, a_1]), \dots, (\tilde{x}_m, [a_{m-1}, a_m])\} \]

are $\delta$-fine, and, by the very definition of $D$, we have

\[ S(D,P) - S(D,Q) = \sum_{j=1}^{m} [D(x_j) - D(\tilde{x}_j)](a_j - a_{j-1}) = \sum_{j=1}^{m} (a_j - a_{j-1}) = b - a. \]

Since $\delta > 0$ was taken arbitrarily, the Cauchy criterion for R-integrability does not
hold, so that $D$ cannot be R-integrable on $[a, b]$.


7.12 Two Theorems Involving Limits 


Let $I = [a,b]$ and $f : I \to \mathbb{R}$ be a continuous function; recall that the integral
function $\int_a^{\cdot} f$ is a primitive of $f$, so it is surely continuous on $I$. It is then possible
to define the function $\Psi : C(I, \mathbb{R}) \to C(I, \mathbb{R})$ as

\[ \Psi(f) = \int_a^{\cdot} f. \]

Taking $f, g \in C(I, \mathbb{R})$, for every $x \in [a, b]$ one has that

\[ \big|[\Psi(f)](x) - [\Psi(g)](x)\big| = \Big|\int_a^x (f - g)\Big| \le \int_a^x |f - g| \le \int_a^b |f - g| \le (b-a)\,\|f - g\|_\infty, \]

whence

\[ \|\Psi(f) - \Psi(g)\|_\infty \le (b-a)\,\|f - g\|_\infty. \]

This implies that $\Psi$ is a continuous function. We will use this fact in the following
two theorems involving limits.

We consider the situation where a sequence of continuous functions $(f_n)_n$
converges pointwise to a function $f$, i.e., for every $x \in I$,

\[ \lim_n f_n(x) = f(x). \]

The question is whether $f$ is integrable on $I$, with

\[ \int_a^b f = \lim_n \int_a^b f_n, \]

i.e., whether

\[ \int_a^b \lim_n f_n = \lim_n \int_a^b f_n. \]

In other words, we wonder if it is possible to commute the operations of integral and
limit.




Example Let us first show that in some cases the answer could be no. Consider the
functions $f_n : [0, \pi] \to \mathbb{R}$, with $n = 1, 2, \dots$, defined by

\[
f_n(x) = \begin{cases}
n \sin(nx) & \text{if } x \in [0, \frac{\pi}{n}], \\
0 & \text{otherwise.}
\end{cases}
\]

For every $x \in [0, \pi]$ we have $\lim_n f_n(x) = 0$, whereas

\[ \int_0^{\pi} f_n(x)\,dx = \int_0^{\pi/n} n \sin(nx)\,dx = \int_0^{\pi} \sin(t)\,dt = 2. \]

Hence, in this case,

\[ \int_0^{\pi} \lim_n f_n = 0 \ne 2 = \lim_n \int_0^{\pi} f_n. \]
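
The counterexample can be explored numerically. The sketch below, whose helper names and midpoint quadrature are illustrative assumptions, shows that the integrals of $f_n$ stay close to 2 for several values of $n$, while at any fixed point the values $f_n(x)$ are eventually 0.

```python
import math

def f_n(n, x):
    """n*sin(n*x) on [0, pi/n], and 0 elsewhere on [0, pi]."""
    return n * math.sin(n * x) if 0.0 <= x <= math.pi / n else 0.0

def integral_f_n(n, steps=100000):
    """Midpoint-rule approximation of the integral of f_n over [0, pi]."""
    h = math.pi / steps
    return sum(f_n(n, (i + 0.5) * h) for i in range(steps)) * h

for n in (1, 10, 100):
    # the integral stays near 2, while the value at the fixed point x = 0.5
    # is 0 as soon as pi/n < 0.5
    print(n, integral_f_n(n), f_n(n, 0.5))
```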


In the following theorem, which will be generalized later (Theorem 9.13), the 
answer to the foregoing question is positive, provided we assume the convergence 
to be uniform. 


Theorem 7.24 Let $(f_n)_n$ be a uniformly convergent sequence in $C([a, b], \mathbb{R})$. Then

\[ \int_a^b \lim_n f_n = \lim_n \int_a^b f_n. \]


Proof Let $\lim_n f_n = f : I \to \mathbb{R}$. Since the convergence is uniform, we know that
$f \in C(I, \mathbb{R})$. Moreover, since $\Psi$ is continuous, we have that $\lim_n \Psi(f_n) = \Psi(f)$,
i.e.,

\[ \lim_n [\Psi(f_n)](x) = [\Psi(f)](x), \quad \text{uniformly in } x \in [a,b]. \]

In particular, taking $x = b$,

\[ \lim_n \int_a^b f_n = \int_a^b f, \]

which is what we wanted to prove. □


In the second theorem, an analogous question concerning the possibility of 
commuting the operations of derivative and limit is analyzed. 




Theorem 7.25 Let $x_0 \in I$, $y_0 \in \mathbb{R}$, $(f_n)_n$ be a sequence in $C^1(I, \mathbb{R})$ and
$g \in C(I, \mathbb{R})$ be such that

\[ \lim_n f_n(x_0) = y_0 \quad\text{and}\quad \lim_n f_n' = g \ \text{ uniformly on } I. \]

Then $(f_n)_n$ converges uniformly to some function $f$. Moreover, $f \in C^1(I, \mathbb{R})$ and
$f' = g$. Consequently, we can write

\[ \frac{d}{dx} \lim_n f_n = \lim_n \frac{d}{dx} f_n. \]

Proof Let us define the function $f : I \to \mathbb{R}$ as

\[ f(x) = y_0 + \int_{x_0}^{x} g(t)\,dt. \]

Since $g$ is continuous, the function $f$ is differentiable and $f'(x) = g(x)$ for every
$x \in I$. In particular, $f \in C^1(I, \mathbb{R})$. The proof will be completed showing that $(f_n)_n$
converges uniformly to $f$.

By the Fundamental Theorem, for every $n \in \mathbb{N}$ and $x \in I$ we can write

\[ f_n(x) = f_n(x_0) + \int_{x_0}^{x} f_n'(t)\,dt = f_n(x_0) + [\Psi(f_n')](x) - [\Psi(f_n')](x_0). \]

Since $\lim_n f_n' = g$ in $C(I, \mathbb{R})$, we have that $\lim_n \Psi(f_n') = \Psi(g)$ in $C(I, \mathbb{R})$, i.e.,

\[ \lim_n [\Psi(f_n')](x) = [\Psi(g)](x), \quad \text{uniformly in } x \in I. \]

Hence, since also $\lim_n f_n(x_0) = y_0$, we have that

\[ \lim_n f_n(x) = y_0 + [\Psi(g)](x) - [\Psi(g)](x_0) = y_0 + \int_{x_0}^{x} g(t)\,dt, \quad \text{uniformly in } x \in I. \]

We have thus proved that $(f_n)_n$ converges uniformly to $f$. □




7.13 Integration on Noncompact Intervals 
We begin by considering a function $f : [a, b[\, \to \mathbb{R}$, where $b \le +\infty$. Assume that
$f$ is integrable on every compact interval of the type $[a,c]$, with $c \in \,]a, b[$. This
happens, for instance, when $f$ is continuous on $[a, b[$.


Definition 7.26 We say that a function $f : [a,b[\, \to \mathbb{R}$ is “integrable” if $f$ is
integrable on $[a, c]$ for every $c \in \,]a, b[$, and the limit

\[ \lim_{c \to b^-} \int_a^c f \]

exists and is finite. In that case, the preceding limit is called the “integral” of $f$ on
$[a, b[$ and it is denoted by $\int_a^b f$, or by $\int_a^b f(x)\,dx$.


In particular, if $b = +\infty$, then we will write $\int_a^{+\infty} f$, or $\int_a^{+\infty} f(x)\,dx$.


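Examples Let $a > 0$; it is readily seen that the function $f : [a, +\infty[\, \to \mathbb{R}$, defined
by $f(x) = x^{-\alpha}$, is integrable if and only if $\alpha > 1$, in which case we have

\[ \int_a^{+\infty} \frac{dx}{x^{\alpha}} = \frac{a^{1-\alpha}}{\alpha - 1}. \]

Consider now the case $a < b < +\infty$. It can be verified that the function
$f : [a, b[\, \to \mathbb{R}$, defined by $f(x) = (b - x)^{-\beta}$, is integrable if and only if $\beta < 1$, in
which case we have

\[ \int_a^b \frac{dx}{(b-x)^{\beta}} = \frac{(b-a)^{1-\beta}}{1 - \beta}. \]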
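
The first example can be illustrated numerically. In the sketch below, the helper `midpoint` and the chosen values $a = 1$, $\alpha = 2$ are assumptions made for illustration: the integrals over $[a, c]$ approach $a^{1-\alpha}/(\alpha - 1) = 1$ as $c$ grows, whereas for $\alpha = 1$ they keep growing.

```python
def midpoint(func, a, c, steps=200000):
    """Midpoint-rule approximation of the integral of func over [a, c]."""
    h = (c - a) / steps
    return sum(func(a + (i + 0.5) * h) for i in range(steps)) * h

a, alpha = 1.0, 2.0
expected_limit = a ** (1 - alpha) / (alpha - 1)  # value given by the formula in the text

for c in (10.0, 100.0, 1000.0):
    convergent = midpoint(lambda x: x ** -alpha, a, c)  # approaches expected_limit
    divergent = midpoint(lambda x: 1.0 / x, a, c)       # case alpha = 1: unbounded growth
    print(c, convergent, expected_limit, divergent)
```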


We also say that the integral converges if the function $f$ is integrable on $[a, b[$,
i.e., when the limit $\lim_{c \to b^-} \int_a^c f$ exists and is finite. If the limit does not exist, we
say the integral is undetermined. If it exists and equals $+\infty$ or $-\infty$, we say that the
integral diverges to $+\infty$ or to $-\infty$, respectively.




It is clear that the convergence of the integral depends solely on the behavior of 
the function “near” the point b. In other words, if the function is modified outside a 
neighborhood of b, the convergence of the integral is by no means compromised. 

Let us now state the Cauchy criterion. 


Theorem 7.27 Let $f : [a,b[\, \to \mathbb{R}$ be a function that is integrable on $[a, c]$, for
every $c \in \,]a, b[$. Then $f$ is integrable on $[a, b[$ if and only if for every $\varepsilon > 0$ there
is a $\bar{c} \in \,]a, b[$ such that, for any $c'$ and $c''$ in $[\bar{c}, b[$, we have that

\[ \Big|\int_{c'}^{c''} f\Big| \le \varepsilon. \]


Proof It is a direct consequence of Theorem 4.15, when applied to the function
$F : [a, b[\, \to \mathbb{R}$ defined as $F(c) = \int_a^c f$. □


From the Cauchy criterion we deduce the following comparison criterion. 
Theorem 7.28 Let $f : [a,b[\, \to \mathbb{R}$ be a function that is integrable on $[a, c]$, for
every $c \in \,]a,b[$. If there is an integrable function $g : [a, b[\, \to \mathbb{R}$ such that, for
every $x \in [a, b[$,

\[ |f(x)| \le g(x), \]

then $f$ is integrable on $[a, b[$, too.


Proof Once $\varepsilon > 0$ is fixed, there is a $\bar{c} \in \,]a, b[$ such that, taking arbitrarily $c'$, $c''$ in
$[\bar{c}, b[$, one has that $\big|\int_{c'}^{c''} g\big| \le \varepsilon$. If, for example, $c' \le c''$, since $-g \le f \le g$, we
have

\[ -\int_{c'}^{c''} g \le \int_{c'}^{c''} f \le \int_{c'}^{c''} g, \]

and therefore

\[ \Big|\int_{c'}^{c''} f\Big| \le \int_{c'}^{c''} g \le \varepsilon. \]

The Cauchy criterion then applies, whence the conclusion. □


Note that it would have been sufficient to assume the inequality $|f(x)| \le g(x)$
on $[\bar{c}, b[$. As an immediate consequence, we have the following corollary.




Corollary 7.29 Let $f : [a,b[\, \to \mathbb{R}$ be a function that is integrable on $[a, c]$ for
every $c \in \,]a, b[$. If $|f|$ is integrable on $[a, b[$, then $f$ is, too, and

\[ \Big|\int_a^b f\Big| \le \int_a^b |f|. \]


A function satisfying the assumption of the preceding corollary will be said to be 
L-integrable. Let us now state a corollary of the comparison criterion that is often 
used in practice. 


Corollary 7.30 Let $f, g : [a, b[\, \to \mathbb{R}$ be two functions with positive values that are
integrable on $[a, c]$ for every $c \in \,]a, b[$. Assume that the limit

\[ L = \lim_{x \to b^-} \frac{f(x)}{g(x)} \]

exists. Then the following conclusions hold:

(a) If $L \in \,]0, +\infty[$, then $f$ is integrable on $[a, b[$ if and only if $g$ is.
(b) If $L = 0$ and $g$ is integrable on $[a, b[$, then $f$ is as well.
(c) If $L = +\infty$ and $g$ is not integrable on $[a, b[$, then neither is $f$.


Proof Case (a). If $L \in \,]0, +\infty[$, then there exists a $\bar{c} \in \,]a, b[$ such that

\[ x \in [\bar{c}, b[ \;\Longrightarrow\; \frac{L}{2} \le \frac{f(x)}{g(x)} \le \frac{3L}{2}, \]

i.e.,

\[ f(x) \le \frac{3L}{2}\,g(x) \quad\text{and}\quad g(x) \le \frac{2}{L}\,f(x). \]

The conclusion then follows from the comparison criterion.

Case (b). If $L = 0$, then there exists a $\bar{c} \in \,]a, b[$ such that, if $x \in [\bar{c}, b[$, then
$f(x) \le g(x)$, and the comparison criterion applies.

Case (c). If $L = +\infty$, then we reduce this to case (b) by exchanging the roles of
$f$ and $g$. □


Example Consider the function $f : [0, +\infty[\, \to \mathbb{R}$ defined by

\[ f(x) = e^{\frac{1}{x^2+1}} - 1. \]

As a comparison function, we take

\[ g(x) = \frac{1}{x^2+1}. \]

It is integrable on $[0, +\infty[$, with

\[ \int_0^{+\infty} \frac{1}{x^2+1}\,dx = \lim_{c \to +\infty} [\arctan x]_0^c = \frac{\pi}{2}. \]

Since

\[ \lim_{x \to +\infty} \frac{f(x)}{g(x)} = \lim_{t \to 0^+} \frac{e^t - 1}{t} = 1, \]

we conclude that $f$ is integrable on $[0, +\infty[$ as well.


We now consider the case of a function $f : \,]a, b] \to \mathbb{R}$, with $a \ge -\infty$. There is
an analogous definition of its integral.


Definition 7.31 We say that a function $f : \,]a,b] \to \mathbb{R}$ is “integrable” if $f$ is
integrable on $[c, b]$ for every $c \in \,]a, b[$, and the limit

\[ \lim_{c \to a^+} \int_c^b f \]

exists and is finite. In that case, the preceding limit is called the “integral” of $f$ on
$]a, b]$, and it is denoted by $\int_a^b f$ or $\int_a^b f(x)\,dx$.


Given the function $f : \,]a,b] \to \mathbb{R}$, it is possible to consider the function
$g : [a', b'[\, \to \mathbb{R}$, with $a' = -b$ and $b' = -a$, defined by $g(x) = f(-x)$. It is easy to
see that $f$ is integrable on $]a, b]$ if and only if $g$ is integrable on $[a', b'[$. In this way,
we are led back to the previous context.

We will also define the integral of a function $f : \,]a, b[\, \to \mathbb{R}$, with
$-\infty \le a < b \le +\infty$, in the following way.


Definition 7.32 We say that $f : \,]a, b[\, \to \mathbb{R}$ is “integrable” if, once we fix a point
$p \in \,]a, b[$, the function $f$ is integrable on $[p, b[$ and on $]a, p]$. In that case, the
“integral” of $f$ on $]a, b[$ is defined by

\[ \int_a^b f = \int_a^p f + \int_p^b f. \]

It is easy to verify that the given definition does not depend on the choice of
$p \in \,]a,b[$.




Examples If $a, b \in \mathbb{R}$, one can verify that the function

\[ f(x) = \frac{1}{[(x - a)(b - x)]^{\beta}} \]

is integrable on $]a, b[$ if and only if $\beta < 1$. In this case, it is possible to choose, for
instance, $p = (a + b)/2$.

Another case arises when $a = -\infty$ and $b = +\infty$. For example, we can easily
verify that the function $f(x) = (x^2 + 1)^{-1}$ is integrable on $]-\infty, +\infty[$. Taking,
for instance, $p = 0$, we have

\[ \int_{-\infty}^{+\infty} \frac{1}{x^2+1}\,dx = \int_{-\infty}^{0} \frac{1}{x^2+1}\,dx + \int_{0}^{+\infty} \frac{1}{x^2+1}\,dx = \pi. \]


Another case that might be encountered in the applications is when a function 
happens not to be defined in an interior point of an interval. 


Definition 7.33 Given $a < q < b$, we say that a function $f : [a,b] \setminus \{q\} \to \mathbb{R}$ is
integrable if $f$ is integrable on both $[a, q[$ and $]q, b]$. In that case, we set

\[ \int_a^b f = \int_a^q f + \int_q^b f. \]


For example, if $a < 0 < b$, then the function $f(x) = 1/\sqrt{|x|}$ is integrable on
$[a, b] \setminus \{0\}$, and

\[ \int_a^b \frac{1}{\sqrt{|x|}}\,dx = \int_a^0 \frac{1}{\sqrt{-x}}\,dx + \int_0^b \frac{1}{\sqrt{x}}\,dx = 2\sqrt{-a} + 2\sqrt{b}. \]

On the other hand, the function $f(x) = 1/x$ is not integrable on $[-1, 1] \setminus \{0\}$, even
though the fact that $f$ is odd might tempt us to say that its integral equals zero. With
such a convention, however, some important properties of the integral would be lost,
for example, the additivity on subintervals.
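
To see concretely why assigning the value zero to the integral of $1/x$ on $[-1, 1] \setminus \{0\}$ is problematic, the following sketch evaluates the sum of the integrals over $[-1, -\varepsilon_1]$ and $[\varepsilon_2, 1]$ using the primitive $\ln|x|$. The function name and the cutoff parameters are illustrative assumptions. With symmetric cutoffs the result is 0, but with $\varepsilon_2 = 2\varepsilon_1$ it is $-\ln 2$, however small the cutoffs are, and each of the two pieces alone is unbounded.

```python
import math

def truncated_integral_of_reciprocal(eps_left, eps_right):
    """Integral of 1/x over [-1, -eps_left] plus its integral over [eps_right, 1]."""
    left = math.log(eps_left)      # ln|x| evaluated between -1 and -eps_left
    right = -math.log(eps_right)   # ln|x| evaluated between eps_right and 1
    return left + right

for eps in (1e-2, 1e-4, 1e-6):
    symmetric = truncated_integral_of_reciprocal(eps, eps)
    asymmetric = truncated_integral_of_reciprocal(eps, 2 * eps)
    print(eps, symmetric, asymmetric, math.log(eps))  # last column: left piece alone
```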


7.14 Functions with Vector Values 


We now consider a function $f : [a,b] \to \mathbb{R}^N$ with vector values. As usual, we can
write

\[ f(x) = (f_1(x), \dots, f_N(x)), \]

where the functions $f_k : [a,b] \to \mathbb{R}$ are the components of $f$. We say that $f$ is
integrable whenever all the components are integrable functions, and in that case
we can define the integral of $f$ as

\[ \int_a^b f(x)\,dx = \Big(\int_a^b f_1(x)\,dx, \dots, \int_a^b f_N(x)\,dx\Big). \]

The integral is thus a vector in $\mathbb{R}^N$.
A particular case is encountered when $f : [a, b] \to \mathbb{C}$. Writing

\[ f(x) = f_1(x) + i f_2(x), \]

we will have

\[ \int_a^b f(x)\,dx = \int_a^b f_1(x)\,dx + i \int_a^b f_2(x)\,dx. \]


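Theorem 7.34 Assume that both $f : [a,b] \to \mathbb{R}^N$ and $\|f\| : [a,b] \to \mathbb{R}$ are
integrable and $a < b$. Then

\[ \Big\|\int_a^b f(x)\,dx\Big\| \le \int_a^b \|f(x)\|\,dx. \]

Proof Set $v = \int_a^b f(x)\,dx$, i.e., $v = (v_1, \dots, v_N)$, with $v_k = \int_a^b f_k(x)\,dx$ for
$k = 1, \dots, N$. If $v = 0$, then the statement surely holds. Assume now that $v \ne 0$.
Then, using the Schwarz inequality,

\[
\begin{aligned}
\|v\|^2 = \sum_{k=1}^{N} v_k^2 &= \sum_{k=1}^{N} v_k \int_a^b f_k(x)\,dx = \sum_{k=1}^{N} \int_a^b v_k f_k(x)\,dx = \int_a^b \sum_{k=1}^{N} v_k f_k(x)\,dx \\
&= \int_a^b v \cdot f(x)\,dx \le \int_a^b \|v\|\,\|f(x)\|\,dx = \|v\| \int_a^b \|f(x)\|\,dx,
\end{aligned}
\]

whence

\[ \|v\| \le \int_a^b \|f(x)\|\,dx, \]

which is what we wanted to prove. □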
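
The inequality can be checked numerically on a simple curve. In the sketch below, the choice $f(x) = (\cos x, \sin x)$ on $[0, \pi]$ and the helper `integrate` are assumptions made for illustration: the integral of $f$ is approximately the vector $(0, 2)$, of norm 2, while the integral of $\|f\|$ is $\pi$.

```python
import math

def integrate(func, a, b, steps=100000):
    """Midpoint-rule approximation of the integral of a real function over [a, b]."""
    h = (b - a) / steps
    return sum(func(a + (i + 0.5) * h) for i in range(steps)) * h

def f(x):
    """Illustrative vector-valued function with values in R^2."""
    return (math.cos(x), math.sin(x))

a, b = 0.0, math.pi
# the integral of f is computed component by component, as in the definition
v = tuple(integrate(lambda x, k=k: f(x)[k], a, b) for k in range(2))
norm_of_integral = math.hypot(*v)
integral_of_norm = integrate(lambda x: math.hypot(*f(x)), a, b)
print(norm_of_integral, integral_of_norm)
```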


Now let $F : [a,b] \to \mathbb{R}^N$ be a function whose components $F_k : [a,b] \to \mathbb{R}$ are
differentiable. In this case we say that $F$ is differentiable, and, writing

\[ F(x) = (F_1(x), \dots, F_N(x)), \]

we can define its derivative at some $x_0 \in [a, b]$ as

\[
\begin{aligned}
F'(x_0) &= \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0} \\
&= \Big(\lim_{x \to x_0} \frac{F_1(x) - F_1(x_0)}{x - x_0}, \dots, \lim_{x \to x_0} \frac{F_N(x) - F_N(x_0)}{x - x_0}\Big) \\
&= (F_1'(x_0), \dots, F_N'(x_0)).
\end{aligned}
\]

Here is a version of the Fundamental Theorem in this context.
Here is a version of the Fundamental Theorem in this context. 


Theorem 7.35 (Fundamental Theorem III) If $F : [a,b] \to \mathbb{R}^N$ is differentiable,
then $F' : [a,b] \to \mathbb{R}^N$ is integrable, and

\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]


Proof Since each component $F_k : [a, b] \to \mathbb{R}$ is differentiable, by the Fundamental
Theorem we know that the derivatives $F_k' : [a, b] \to \mathbb{R}$ are integrable and

\[ \int_a^b F_k'(x)\,dx = F_k(b) - F_k(a), \quad \text{for every } k = 1, \dots, N. \]

Then $F'$ is integrable, and

\[
\begin{aligned}
\int_a^b F'(x)\,dx &= \Big(\int_a^b F_1'(x)\,dx, \dots, \int_a^b F_N'(x)\,dx\Big) \\
&= (F_1(b) - F_1(a), \dots, F_N(b) - F_N(a)) \\
&= (F_1(b), \dots, F_N(b)) - (F_1(a), \dots, F_N(a)) \\
&= F(b) - F(a),
\end{aligned}
\]

thereby completing the proof. □


Part III


Further Developments


8 Numerical Series and Series of Functions


8.1 Introduction and First Properties 


Let $V$ be a normed vector space. Given a sequence $(a_k)_k$ in $V$, the associated
“series” is the sequence $(s_n)_n$ defined by

\[
\begin{aligned}
s_0 &= a_0, \\
s_1 &= a_0 + a_1, \\
s_2 &= a_0 + a_1 + a_2, \\
&\;\;\vdots \\
s_n &= a_0 + a_1 + a_2 + \dots + a_n, \\
&\;\;\vdots
\end{aligned}
\]

The element $a_k$ is called the “$k$th term” of the series, whereas $s_n = \sum_{k=0}^{n} a_k$ is said
to be the “$n$th partial sum” of the series. Whenever $(s_n)_n$ has a limit in $V$, we say
that the series “converges.” In that case, the limit $S = \lim_n s_n$ is said to be the “sum
of the series,” and we will write

\[ S = \lim_n \Big(\sum_{k=0}^{n} a_k\Big) = \sum_{k=0}^{\infty} a_k, \]

and sometimes we will also use the notation

\[ S = a_0 + a_1 + a_2 + \dots + a_k + \dots \]






However, by an abuse of notation, the series $(s_n)_n$ itself is often denoted by the same
symbols,

\[ \text{either}\quad \sum_{k=0}^{\infty} a_k \quad\text{or}\quad a_0 + a_1 + a_2 + \dots + a_k + \dots \]

Sometimes, for brevity’s sake, we will simply write $\sum_k a_k$.
Let us analyze three examples, in all of which $V = \mathbb{R}$.


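Example 1 For $\alpha \in \mathbb{R}$, the “geometric series”

\[ 1 + \alpha + \alpha^2 + \alpha^3 + \dots + \alpha^k + \dots \]

has as its $k$th term $a_k = \alpha^k$. If $\alpha \ne 1$, the $n$th partial sum is

\[ s_n = \sum_{k=0}^{n} \alpha^k = \frac{\alpha^{n+1} - 1}{\alpha - 1}, \]

whereas if $\alpha = 1$, then we have $s_n = n + 1$. Hence, the series converges if and only
if $|\alpha| < 1$, in which case its sum is

\[ \sum_{k=0}^{\infty} \alpha^k = \frac{1}{1 - \alpha}. \]

Notice that if $\alpha \ge 1$, then $\lim_n s_n = +\infty$, whereas if $\alpha < -1$, then the sequence
$(s_n)_n$ has no limit since $\liminf_n s_n = -\infty$ and $\limsup_n s_n = +\infty$. For $\alpha = -1$ the
partial sums alternate between 1 and 0, so again there is no limit.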
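
The behavior of the partial sums can be observed with a short computation. The sketch below, with illustrative function names, compares the directly computed partial sums with the closed formula and shows the three regimes for $\alpha = 1/2$, $\alpha = -1$, and $\alpha = 2$.

```python
def geometric_partial_sum(alpha, n):
    """s_n = 1 + alpha + alpha^2 + ... + alpha^n, computed term by term."""
    return sum(alpha ** k for k in range(n + 1))

def closed_form(alpha, n):
    """The formula (alpha^(n+1) - 1) / (alpha - 1), with the case alpha = 1 apart."""
    return n + 1 if alpha == 1 else (alpha ** (n + 1) - 1) / (alpha - 1)

for alpha in (0.5, -1, 2):
    # alpha = 0.5: sums approach 1/(1 - 0.5) = 2; alpha = -1: sums oscillate;
    # alpha = 2: sums grow without bound
    print(alpha, [geometric_partial_sum(alpha, n) for n in range(6)])
```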


In general, if the sequence $(s_n)_n$ has no limit, then we say that the series is
“undetermined.” On the other hand, for real valued series we say that

- The series “diverges to $+\infty$” if $\lim_n s_n = +\infty$.
- The series “diverges to $-\infty$” if $\lim_n s_n = -\infty$.


Example 2 The series

\[ \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \dots \]

has as its $k$th term $a_k = \frac{1}{(k+1)(k+2)}$. It is a “telescopic series”:

\[ s_n = \Big(1 - \frac{1}{2}\Big) + \Big(\frac{1}{2} - \frac{1}{3}\Big) + \dots + \Big(\frac{1}{n+1} - \frac{1}{n+2}\Big). \]

Hence, simplifying,

\[ s_n = 1 - \frac{1}{n+2}, \]

leading to $\lim_n s_n = 1$. We have thus proved that the series converges and that its
sum is equal to 1. We can then write

\[ \sum_{k=0}^{\infty} \frac{1}{(k+1)(k+2)} = 1. \]

In the preceding example, one could use different notations in the sum like, e.g.,

\[ \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1, \]

or variants of it; for example, the letter “k” could be replaced by any other, so that,
e.g., $\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$. These remarks apply indeed to all series.


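Example 3 The “harmonic series”

\[ 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n+1} + \dots \]

has as its $k$th term $a_k = \frac{1}{k+1}$. It diverges to $+\infty$; we can see this by writing it as

\[ 1 + \frac{1}{2} + \Big(\frac{1}{3} + \frac{1}{4}\Big) + \Big(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\Big) + \Big(\frac{1}{9} + \frac{1}{10} + \dots + \frac{1}{16}\Big) + \dots, \]

gathering together 2 of its terms, then 4, then 8, then 16, and so on, doubling
their number each time. It is easy to see that the sums in the parentheses are all
greater than $\frac{1}{2}$. Hence, the sequence of partial sums must have the limit $+\infty$. We
can then write

\[ \sum_{k=0}^{\infty} \frac{1}{k+1} = +\infty. \]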
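
The grouping argument gives a concrete lower bound: the sum of the first $2^m$ terms is at least $1 + m/2$. The following sketch, whose function name is an illustrative assumption, checks this bound for several values of $m$ and shows how slowly the partial sums grow.

```python
def harmonic_partial_sum(num_terms):
    """Sum of the first num_terms terms 1 + 1/2 + ... + 1/num_terms."""
    return sum(1.0 / k for k in range(1, num_terms + 1))

for m in range(0, 15, 2):
    s = harmonic_partial_sum(2 ** m)
    # each of the m groups closed by 1/2, 1/4, 1/8, ... contributes at least 1/2
    print(m, 2 ** m, s, 1 + m / 2)
```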


It must be said that the explicit computation of the sum of a series is a rare event. 
Very often we will already be satisfied when proving that a series converges or not. 




It is important to notice that the convergence of a series is not compromised if
only a finite number of its terms are modified. Indeed, if the series converges, we
can change, add, or delete a finite number of initial terms, and the new series thus
obtained will still converge. In contrast, if the series does not converge, because
either it is undetermined or diverges to $\pm\infty$, the same will be true of the modified
series.


Theorem 8.1 If a series $\sum_k a_k$ converges, then

\[ \lim_n a_n = 0. \]

Proof Let $\lim_n s_n = S \in V$. Then also $\lim_n s_{n-1} = S$, and hence

\[ \lim_n a_n = \lim_n (s_n - s_{n-1}) = \lim_n s_n - \lim_n s_{n-1} = S - S = 0, \]

which is what we wanted to prove. □


Let us study the behavior of series with respect to the sum and to the product by 
a scalar. 


Theorem 8.2 Assume that the two series $\sum_k a_k$ and $\sum_k b_k$ converge with sums $A$
and $B$, respectively. Then the series $\sum_k (a_k + b_k)$ also converges, and its sum is
$A + B$. Moreover, for any fixed $\alpha \in \mathbb{R}$, the series $\sum_k (\alpha a_k)$ also converges, and its
sum is $\alpha A$. We will briefly write

\[ \sum_{k=0}^{\infty} (a_k + b_k) = \sum_{k=0}^{\infty} a_k + \sum_{k=0}^{\infty} b_k, \qquad \sum_{k=0}^{\infty} (\alpha a_k) = \alpha \sum_{k=0}^{\infty} a_k. \]


Proof Let $s_n = \sum_{k=0}^{n} a_k$ and $s_n' = \sum_{k=0}^{n} b_k$. Then

\[ s_n + s_n' = \sum_{k=0}^{n} (a_k + b_k), \qquad \alpha s_n = \sum_{k=0}^{n} (\alpha a_k), \]

and the result follows, passing to the limits. □
Let us see how the Cauchy criterion adapts to series in Banach spaces.


Theorem 8.3 If $V$ is a Banach space, the series $\sum_k a_k$ converges if and only if

\[ \forall \varepsilon > 0 \quad \exists \bar{n} : \quad n > m \ge \bar{n} \;\Longrightarrow\; \Big\|\sum_{k=m+1}^{n} a_k\Big\| < \varepsilon. \]


Proof Since $V$ is complete, the sequence $(s_n)_n$ has a limit in $V$ if and only if it is a
Cauchy sequence, i.e.,

\[ \forall \varepsilon > 0 \quad \exists \bar{n} : \quad [\,m \ge \bar{n} \text{ and } n \ge \bar{n}\,] \;\Longrightarrow\; \|s_n - s_m\| < \varepsilon. \]

Now it is not restrictive to take $n > m$, and if we substitute $s_n = \sum_{k=0}^{n} a_k$ and
$s_m = \sum_{k=0}^{m} a_k$, the conclusion follows. □


We now state a useful convergence criterion. 


Theorem 8.4 If $V$ is a Banach space and the series $\sum_k \|a_k\|$ converges, then the
series $\sum_k a_k$ also converges.


In that case we say that the series $\sum_k a_k$ “converges in norm,” unless $V$ coincides
with either $\mathbb{R}$ or $\mathbb{C}$, in which cases we say that the series “converges absolutely.”


Proof We assume that the series $\sum_k \|a_k\|$ converges. Let $\varepsilon > 0$ be fixed. By the
Cauchy criterion, there exists an $\bar{n} \in \mathbb{N}$ such that

\[ n > m \ge \bar{n} \;\Longrightarrow\; \sum_{k=m+1}^{n} \|a_k\| < \varepsilon. \]

Since

\[ \Big\|\sum_{k=m+1}^{n} a_k\Big\| \le \sum_{k=m+1}^{n} \|a_k\|, \]

we have that

\[ n > m \ge \bar{n} \;\Longrightarrow\; \Big\|\sum_{k=m+1}^{n} a_k\Big\| < \varepsilon, \]

and the conclusion follows, using the Cauchy criterion again. □


The convergence in norm of a series thus reinforces the interest in examining the 
series of positive real numbers. 




8.2 Series of Real Numbers 


In this section we only consider series with terms $a_k$ in $\mathbb{R}$.

If for every $k$ one has that $a_k \ge 0$, then the sequence $(s_n)_n$ is increasing, hence
it has a limit, and we have only two possibilities: The series either converges or it
diverges to $+\infty$.

The following comparison criterion will be very useful. 


Theorem 8.5 Let $\sum_k a_k$ and $\sum_k b_k$ be two series for which

\[ \exists \bar{k} \in \mathbb{N} : \quad k \ge \bar{k} \;\Longrightarrow\; 0 \le a_k \le b_k. \]

Then

(a) If $\sum_k b_k$ converges, then $\sum_k a_k$ also converges.
(b) If $\sum_k a_k$ diverges, then $\sum_k b_k$ also diverges.


Proof We define

\[ s_n = a_0 + a_1 + a_2 + \dots + a_n, \qquad s_n' = b_0 + b_1 + b_2 + \dots + b_n. \]

By previous considerations, we can modify a finite number of terms in the two series
and assume without loss of generality that $0 \le a_k \le b_k$ for every $k$. Then the two
sequences $(s_n)_n$ and $(s_n')_n$ are increasing, and $s_n \le s_n'$ for every $n$. Consequently, the
limits $S = \lim_n s_n$ and $S' = \lim_n s_n'$ exist, and $S \le S' \le +\infty$. If $\sum_k b_k$ converges,
then $S' \in \mathbb{R}$, so also $S \in \mathbb{R}$, meaning $\sum_k a_k$ converges. If $\sum_k a_k$ diverges, then
$S = +\infty$, so also $S' = +\infty$, meaning $\sum_k b_k$ diverges. □


Example The series

\[ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots + \frac{1}{(n+1)^2} + \dots \]

converges. This can be proved by comparing it with the series

\[ 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \dots + \frac{1}{n(n+1)} + \dots, \]

which is a slight modification of the one already treated earlier in Example 2. All
the terms of the first series are smaller than or equal to the corresponding ones of
the second series, which converges.
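
The term-by-term comparison can be visualized on the partial sums. In the sketch below, the helper `partial_sums` and the number of terms are illustrative assumptions: the partial sums of the first series never exceed those of the second, which in turn equal $2 - \frac{1}{n+1}$ for $n \ge 1$ and therefore stay below 2.

```python
def partial_sums(term, n):
    """Return the list s_0, ..., s_n for the series whose kth term is term(k)."""
    sums, total = [], 0.0
    for k in range(n + 1):
        total += term(k)
        sums.append(total)
    return sums

# first series: 1 + 1/2^2 + 1/3^2 + ...
squares = partial_sums(lambda k: 1.0 / (k + 1) ** 2, 1000)
# comparison series: 1 + 1/(1*2) + 1/(2*3) + ...
telescopic = partial_sums(lambda k: 1.0 if k == 0 else 1.0 / (k * (k + 1)), 1000)

print(squares[-1], telescopic[-1])
```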


As a first corollary, we have the asymptotic comparison criterion. 




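Corollary 8.6 Let $\sum_k a_k$ and $\sum_k b_k$ be two series with positive terms, for which
the limit

\[ \ell = \lim_k \frac{a_k}{b_k} \]

exists. We have three cases:

(a) $\ell \in \,]0, +\infty[$; the two series either both converge or both diverge.
(b) $\ell = 0$; if $\sum_k b_k$ converges, then $\sum_k a_k$ also converges.
(c) $\ell = +\infty$; if $\sum_k b_k$ diverges, then $\sum_k a_k$ also diverges.


Proof Case (a). If $\ell \in \,]0, +\infty[$, then there exists a $\bar{k}$ such that

\[ k \ge \bar{k} \;\Longrightarrow\; \frac{\ell}{2} \le \frac{a_k}{b_k} \le \frac{3\ell}{2}, \]

i.e.,

\[ a_k \le \frac{3\ell}{2}\,b_k \quad\text{and}\quad b_k \le \frac{2}{\ell}\,a_k. \]

The conclusion then follows from the comparison criterion.

Case (b). If $\ell = 0$, then there exists a $\bar{k}$ such that if $k \ge \bar{k}$, then $a_k \le b_k$, and the
comparison criterion applies.

Case (c). If $\ell = +\infty$ we have the analogue of Case (b) with the roles of $a_k$ and $b_k$
interchanged. □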


The following corollary provides us with the “root test.”

Corollary 8.7 Let $\sum_k a_k$ be a series with nonnegative terms. If

\[ \limsup_k \sqrt[k]{a_k} < 1, \]

then the series converges.


Proof Set $\ell = \limsup_k \sqrt[k]{a_k}$, and let $\alpha \in \,]\ell, 1[$ be an arbitrarily fixed number. Then
there exists a $\bar{k}$ such that

\[ k \ge \bar{k} \;\Longrightarrow\; \sqrt[k]{a_k} < \alpha, \quad\text{i.e.,}\quad a_k < \alpha^k. \]

The conclusion follows by comparison with the geometric series $\sum_k \alpha^k$, which
converges, since $0 < \alpha < 1$. □


Recalling Proposition 3.32, as an immediate consequence we have the “ratio
test.”


Corollary 8.8 Let $\sum_k a_k$ be a series with positive terms. If

$$\limsup_k \frac{a_{k+1}}{a_k} < 1,$$

then the series converges.


We now present the condensation criterion, which we already implicitly used
earlier when dealing with the harmonic series of Example 3.


Theorem 8.9 Let $(a_k)_k$ be a decreasing sequence of nonnegative numbers. Then
the two series

$$\sum_{k=0}^{\infty} a_k, \qquad \sum_{k=0}^{\infty} 2^k a_{2^k}$$

either both converge or both diverge.


Proof For simplicity, we delete the first term $a_0$ from the first series. Let the series
$\sum_k 2^k a_{2^k}$ converge. Then

$$a_1 + (a_2 + a_3) \le a_1 + 2a_2,$$
$$a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) \le a_1 + 2a_2 + 4a_4,$$
$$a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) + (a_8 + a_9 + \cdots + a_{15}) \le a_1 + 2a_2 + 4a_4 + 8a_8,$$

leading to the inequality

$$\sum_{k=1}^{2^{n+1}-1} a_k \le \sum_{k=0}^{n} 2^k a_{2^k}, \quad \text{for every } n \in \mathbb{N}.$$

By comparison, since $\sum_k 2^k a_{2^k}$ converges, then $\sum_k a_k$ also converges.

Assume now, conversely, that the series $\sum_k a_k$ converges. Then

$$a_1 + 2a_2 \le 2(a_1 + a_2),$$
$$a_1 + 2a_2 + 4a_4 \le 2(a_1 + a_2 + (a_3 + a_4)),$$
$$a_1 + 2a_2 + 4a_4 + 8a_8 \le 2(a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8)),$$

leading to

$$\sum_{k=0}^{n} 2^k a_{2^k} \le \sum_{k=1}^{2^n} 2a_k, \quad \text{for every } n \in \mathbb{N}.$$

By comparison, since $\sum_k 2a_k$ converges, then $\sum_k 2^k a_{2^k}$ also converges. ∎


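Example 1 Let us consider the series

$$\sum_{k=1}^{\infty} \frac{1}{k^\beta},$$

where $\beta > 0$ is a fixed real number. The sequence $(a_k)_k$, with $a_k = 1/k^\beta$, is
decreasing. The “condensed series”

$$\sum_{k=1}^{\infty} 2^k a_{2^k} = \sum_{k=1}^{\infty} 2^k \frac{1}{(2^k)^\beta} = \sum_{k=1}^{\infty} (2^{1-\beta})^k$$

is a geometric series of the type $\sum_k \alpha^k$, with $\alpha = 2^{1-\beta}$. It converges if and only if
$|\alpha| < 1$; hence,

$$\sum_{k=1}^{\infty} \frac{1}{k^\beta} \ \text{converges} \iff \beta > 1.$$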
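
As a purely numerical illustration (not a proof), the two inequalities obtained in the proof of Theorem 8.9 can be checked on the sequence $a_k = 1/k^\beta$ of Example 1. The sketch below is only an assumption-laden experiment: the function names `partial_sum` and `condensed_partial_sum` are hypothetical, and floating-point arithmetic is used with a small tolerance.

```python
def partial_sum(beta, n_terms):
    """Sum of 1/k**beta for k = 1, ..., n_terms."""
    return sum(1.0 / k**beta for k in range(1, n_terms + 1))


def condensed_partial_sum(beta, n):
    """Sum of 2**k * a_(2**k) for k = 0, ..., n, where a_k = 1/k**beta."""
    return sum(2.0**k / (2.0**k) ** beta for k in range(n + 1))


for beta in (0.5, 1.0, 2.0):
    for n in range(1, 12):
        left = partial_sum(beta, 2 ** (n + 1) - 1)   # a_1 + ... + a_(2^(n+1) - 1)
        middle = condensed_partial_sum(beta, n)      # condensed partial sum
        right = 2.0 * partial_sum(beta, 2**n)        # 2 (a_1 + ... + a_(2^n))
        # The two inequalities from the proof, up to rounding errors.
        assert left <= middle + 1e-9
        assert middle <= right + 1e-9
    print(beta, partial_sum(beta, 2**12 - 1), condensed_partial_sum(beta, 11))
```

For $\beta \le 1$ both columns keep growing as $n$ increases, while for $\beta > 1$ they stabilize, in agreement with the conclusion of the example.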


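Example 2 Let us now examine the series

$$\sum_{k=2}^{\infty} \frac{1}{k (\ln k)^\beta}$$

for some $\beta > 0$. By some use of differential calculus, it is rather easy to see that the
sequence $(a_k)_k$, with $a_k = 1/(k (\ln k)^\beta)$, is decreasing. The “condensed series” is

$$\sum_{k=1}^{\infty} 2^k \frac{1}{2^k (\ln 2^k)^\beta} = \sum_{k=1}^{\infty} \frac{1}{(k \ln 2)^\beta} = \frac{1}{(\ln 2)^\beta} \sum_{k=1}^{\infty} \frac{1}{k^\beta}.$$

Looking back at the previous example, we conclude that

$$\sum_{k=2}^{\infty} \frac{1}{k (\ln k)^\beta} \ \text{converges} \iff \beta > 1.$$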


Until now in this section we have only considered series with nonnegative terms.
Now we shift to series having alternating signs. Consider a series of the type

$$a_0 - a_1 + a_2 - a_3 + \cdots + (-1)^n a_n + \cdots,$$

where all $a_k$ are positive. What follows is the Leibniz criterion.

Theorem 8.10 If $(a_k)_k$ is a decreasing sequence of positive numbers and

$$\lim_k a_k = 0,$$

then the series $\sum_k (-1)^k a_k$ converges.

Proof Let

$$s_n = a_0 - a_1 + a_2 - a_3 + \cdots + (-1)^n a_n,$$

and consider the sequence $(s_n)_n$ of partial sums. We divide it into two subsequences,
one with even indices and the other one with odd indices. Since $(a_k)_k$ is positive and
decreasing, we see that

$$s_1 \le s_3 \le s_5 \le \cdots \le s_4 \le s_2 \le s_0.$$

Hence, the subsequence $(s_{2m+1})_m$, the one with odd indices, is increasing and
bounded from above, whereas the subsequence $(s_{2m})_m$, the one having even indices,
is decreasing and bounded from below. Then both subsequences have a finite limit,
and we can write

$$\lim_m s_{2m+1} = \ell_1, \qquad \lim_m s_{2m} = \ell_2.$$

On the other hand,

$$\ell_2 - \ell_1 = \lim_m (s_{2m} - s_{2m+1}) = \lim_m a_{2m+1} = 0,$$

hence $\ell_1 = \ell_2$. Because the two subsequences $(s_{2m+1})_m$ and $(s_{2m})_m$ have the same
limit, we can be sure that the sequence $(s_n)_n$ will have the same limit. ∎
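
The bracketing of the limit by odd and even partial sums, used in the proof of the Leibniz criterion, can be observed numerically. The sketch below takes $a_k = 1/(k+1)$, for which the sum of the alternating series is known to be $\ln 2$; the helper name `alternating_partial_sums` is a hypothetical choice, and the checks are floating-point experiments rather than a proof.

```python
import math


def alternating_partial_sums(a, n_max):
    """Return [s_0, ..., s_(n_max)] with s_n = sum_{k=0}^{n} (-1)**k * a(k)."""
    sums, total = [], 0.0
    for k in range(n_max + 1):
        total += (-1) ** k * a(k)
        sums.append(total)
    return sums


s = alternating_partial_sums(lambda k: 1.0 / (k + 1), 2001)
limit = math.log(2.0)
even, odd = s[0::2], s[1::2]

assert all(x >= y for x, y in zip(even, even[1:]))  # even-indexed sums decrease
assert all(x <= y for x, y in zip(odd, odd[1:]))    # odd-indexed sums increase
assert odd[-1] <= limit <= even[-1]                 # the limit lies in between
print(odd[-1], limit, even[-1])
```

The gap between consecutive partial sums is exactly $a_{n+1}$, so the printed values enclose $\ln 2$ within an interval of length $a_{2001}$.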


8.3 Series of Complex Numbers


When we consider a series $\sum_k a_k$ whose terms are complex numbers $a_k = x_k + i y_k$,
where $x_k = \Re(a_k)$ and $y_k = \Im(a_k)$, we can write its partial sums as

$$s_n = \sum_{k=0}^{n} a_k = \sum_{k=0}^{n} (x_k + i y_k) = \sum_{k=0}^{n} x_k + i \sum_{k=0}^{n} y_k = \sigma_n + i \tau_n,$$

where $\sigma_n = \Re(s_n)$ and $\tau_n = \Im(s_n)$. We thus have a sequence $(\sigma_n, \tau_n)_n$ in $\mathbb{R}^2$.
Recalling that such a sequence has a limit in $\mathbb{R}^2$ if and only if both its components
have a limit in $\mathbb{R}$, we obtain the following statement.


Theorem 8.11 If $a_k = x_k + i y_k$, with $x_k$ and $y_k$ being real numbers, the series
$\sum_k a_k$ converges if and only if both series $\sum_k x_k$ and $\sum_k y_k$ converge. In that case,

$$\sum_{k=0}^{\infty} a_k = \sum_{k=0}^{\infty} x_k + i \sum_{k=0}^{\infty} y_k.$$


Example Let us consider the series

$$1 + \frac{i}{2} - \frac{1}{3} - \frac{i}{4} + \frac{1}{5} + \frac{i}{6} - \cdots + \frac{i^n}{n+1} + \cdots$$

The real part is

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots,$$

whereas the imaginary part is

$$\frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + \cdots$$

Both of them converge by the Leibniz criterion on series with alternating signs, so
the given series also converges.




Note that, in the previous example, the series does not converge absolutely since

$$\sum_{k=0}^{\infty} |a_k| = \sum_{k=0}^{\infty} \frac{1}{k+1}$$

is the harmonic series, which we know to be divergent.
We now define the “Cauchy product” of two series $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$. It is
the series

$$\sum_{k=0}^{\infty} \Bigl( \sum_{j=0}^{k} a_{k-j} b_j \Bigr).$$


However, some care is needed concerning its convergence. Indeed, it is not true 
in general that if the two series converge, then their Cauchy product series also 
converges. The following theorem states that this will be true if at least one of them 
converges absolutely. 


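Theorem 8.12 (Mertens’ Theorem) Assume that the series $\sum_k a_k$ and $\sum_k b_k$
converge, with sums $A$ and $B$, respectively. If at least one of them converges
absolutely, then their Cauchy product series converges with sum $AB$.


Proof To fix ideas, let $\sum_k a_k$ converge absolutely, and set $\bar{A} = \sum_{k=0}^{\infty} |a_k|$. We
denote by

$$c_k = \sum_{j=0}^{k} a_{k-j} b_j$$

the $k$th term of the Cauchy product series. Let

$$s_n = \sum_{k=0}^{n} a_k, \qquad s'_n = \sum_{k=0}^{n} b_k, \qquad s''_n = \sum_{k=0}^{n} c_k.$$

Moreover, let $r'_n = B - s'_n$. Then

$$s''_n = a_0 b_0 + (a_1 b_0 + a_0 b_1) + \cdots + (a_n b_0 + a_{n-1} b_1 + \cdots + a_1 b_{n-1} + a_0 b_n)$$
$$= a_0 s'_n + a_1 s'_{n-1} + \cdots + a_{n-1} s'_1 + a_n s'_0$$
$$= a_0 (B - r'_n) + a_1 (B - r'_{n-1}) + \cdots + a_{n-1} (B - r'_1) + a_n (B - r'_0)$$
$$= s_n B - (a_0 r'_n + a_1 r'_{n-1} + \cdots + a_{n-1} r'_1 + a_n r'_0).$$

Since $\lim_n s_n B = AB$, the proof will be completed if

$$\lim_n (a_0 r'_n + a_1 r'_{n-1} + \cdots + a_{n-1} r'_1 + a_n r'_0) = 0.$$

Let $\varepsilon > 0$ be fixed. Since $\lim_n r'_n = 0$, there exists an $\bar{n}_1$ such that

$$n \ge \bar{n}_1 \;\Rightarrow\; |r'_n| \le \varepsilon.$$

Let us set $R = \max\{ |r'_n| : n \in \mathbb{N} \}$. By the Cauchy criterion, there exists an $\bar{n}_2 \ge \bar{n}_1$
such that

$$n \ge \bar{n}_2 \;\Rightarrow\; |a_{n-\bar{n}_1+1}| + |a_{n-\bar{n}_1+2}| + \cdots + |a_n| \le \varepsilon.$$

Then, if $n \ge \bar{n}_2$,

$$|a_0 r'_n + a_1 r'_{n-1} + \cdots + a_{n-1} r'_1 + a_n r'_0|$$
$$\le |a_0|\,|r'_n| + \cdots + |a_{n-\bar{n}_1}|\,|r'_{\bar{n}_1}| + |a_{n-\bar{n}_1+1}|\,|r'_{\bar{n}_1-1}| + \cdots + |a_n|\,|r'_0|$$
$$\le \varepsilon (|a_0| + \cdots + |a_{n-\bar{n}_1}|) + R (|a_{n-\bar{n}_1+1}| + \cdots + |a_n|)$$
$$\le \varepsilon \bar{A} + R \varepsilon = (\bar{A} + R)\varepsilon,$$

thereby completing the proof. ∎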
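
To see the statement of Mertens’ theorem at work, one can compute a truncated Cauchy product numerically. The sketch below is an illustration under stated assumptions: it uses two geometric series with the arbitrarily chosen ratios `x` and `y` (both of modulus less than 1, so both series converge absolutely), and the helper `cauchy_product` is a hypothetical name.

```python
def cauchy_product(a, b):
    """Terms c_k = sum_{j=0}^{k} a[k-j] * b[j], for k < min(len(a), len(b))."""
    n = min(len(a), len(b))
    return [sum(a[k - j] * b[j] for j in range(k + 1)) for k in range(n)]


n_terms = 200
x, y = 0.5, -0.3 + 0.4j               # assumed sample ratios, |x| < 1 and |y| < 1
a = [x**k for k in range(n_terms)]    # geometric series with sum A = 1 / (1 - x)
b = [y**k for k in range(n_terms)]    # geometric series with sum B = 1 / (1 - y)
c = cauchy_product(a, b)

A, B = 1 / (1 - x), 1 / (1 - y)
error = abs(sum(c) - A * B)           # partial sum of the product series vs. A * B
assert error < 1e-12
print(error)
```

The quadratic cost of `cauchy_product` is acceptable here because only a few hundred terms are needed for geometric series.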


8.4 Series of Functions 


Let $E$ be a metric space and $F$ a normed vector space. If we have a sequence of
functions $f_k : E \to F$, for any $x \in E$ we can ask ourselves whether or not the
series $\sum_k f_k(x)$ converges. If there is a subset $U \subseteq E$ and a function $f : U \to F$
such that

$$\sum_{k=0}^{\infty} f_k(x) = f(x), \quad \text{for every } x \in U,$$

we say that the series “converges pointwise” to $f$ on $U$; this happens when, setting
$s_n(x) = \sum_{k=0}^{n} f_k(x)$, the sequence $(s_n)_n$ converges pointwise to $f$ on $U$. We say
that the series “converges uniformly” to $f$ on $U$ if the convergence of $(s_n)_n$ to $f$ is
uniform on $U$, meaning

$$\forall \varepsilon > 0 \quad \exists \bar{n} \in \mathbb{N} : \quad \forall x \in U \quad n \ge \bar{n} \;\Rightarrow\; \Bigl\| \sum_{k=0}^{n} f_k(x) - f(x) \Bigr\| < \varepsilon.$$



Theorem 8.13 If every function $f_k : E \to F$ is continuous on $U \subseteq E$ and the
series $\sum_{k=0}^{\infty} f_k$ converges uniformly to $f$ on $U$, then $f$ is also continuous on $U$.


Proof It is a direct consequence of Theorem 4.17. ∎


Theorem 8.14 Let $E$ be compact, $F$ a Banach space, and all functions $f_k : E \to F$
continuous. If the series $\sum_{k=0}^{\infty} \|f_k\|_\infty$ converges, then the series $\sum_{k=0}^{\infty} f_k$ converges
uniformly to a continuous function on $E$.


Proof We know that $V = C(E, F)$ is a Banach space, and $\sum_{k=0}^{\infty} f_k$ is a series
in $V$ that converges in norm. Then it converges in $V$, meaning that it converges
uniformly. ∎


Example Let E = [a,b] C R and F = R, and let us consider the series 


[ee 


1 
> QB sin(e**—! + arctan(x? + Vk)). 
k=1 


We will examine the series of the norms in C([a, b], R). We have that 


1 


1 
sup | | BZ sin(e***—! + arctan(x? + ve :x ela, oi} < ia’ 


and the series )°, z converges. Then the series converges in norm and, hence, 
uniformly. 


We now adapt the two theorems in Sect. 7.12 to the context of series.


Theorem 8.15 Let $(f_k)_k$ be a sequence in $C([a, b], \mathbb{R})$ such that the series $\sum_k f_k$
is uniformly convergent. Then

$$\int_a^b \sum_{k=0}^{\infty} f_k(t)\, dt = \sum_{k=0}^{\infty} \int_a^b f_k(t)\, dt.$$


Proof It is a direct consequence of Theorem 7.24 when applied to the sequence of
functions $(s_n)_n$. ∎


Theorem 8.16 Let $(f_k)_k$ be a sequence in $C^1([a, b], \mathbb{R})$. Assume that the series
$\sum_k f_k$ and the series of the derivatives $\sum_k f'_k$ converge uniformly to some functions
$f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$, respectively. Then $f$ is of the class $C^1$, and
$f' = g$. Consequently, we can write

$$\frac{d}{dx} \sum_{k=0}^{\infty} f_k(x) = \sum_{k=0}^{\infty} \frac{d}{dx} f_k(x).$$


Proof Set $I = [a, b]$ and consider the sequence of partial sums $s_n = \sum_{k=0}^{n} f_k$. Then $(s_n)_n$ is in
$C^1(I, \mathbb{R})$, and $s'_n = \sum_{k=0}^{n} f'_k$. By assumption, $\lim_n s_n = f$ and $\lim_n s'_n = g$,
uniformly on $I$. Hence, by Theorem 7.25, it must be that $f \in C^1(I, \mathbb{R})$ and
$f' = g$. ∎


Iterating the same argument, we can easily generalize the preceding theorem.


Theorem 8.17 Let $(f_k)_k$ be a sequence in $C^m([a, b], \mathbb{R})$. Assume that the series

$$\sum_k f_k, \quad \sum_k f'_k, \quad \sum_k f''_k, \quad \ldots, \quad \sum_k f_k^{(m)}$$

converge uniformly on $[a, b]$ to some functions $f, g_1, g_2, \ldots, g_m$, respectively. Then
$f$ is of the class $C^m$, and

$$f' = g_1, \quad f'' = g_2, \quad \ldots, \quad f^{(m)} = g_m.$$


8.4.1 Power Series 


An important example of a series of functions is provided by the “power series”

$$(PS)_{\mathbb{C}} \qquad \sum_{k=0}^{\infty} a_k z^k,$$

whose terms are the functions $f_k : \mathbb{C} \to \mathbb{C}$ defined as $f_k(z) = a_k z^k$, for some given
coefficients $a_k \in \mathbb{C}$. Let us first analyze the pointwise convergence.


Theorem 8.18 Setting

$$L = \limsup_k \sqrt[k]{|a_k|},$$

we have the following possibilities:

(a) If $L = +\infty$, then the series $(PS)_{\mathbb{C}}$ converges only for $z = 0$.
(b) If $L = 0$, then the series $(PS)_{\mathbb{C}}$ converges for every $z \in \mathbb{C}$.
(c) If $L \in \,]0, +\infty[$, then the series $(PS)_{\mathbb{C}}$ converges if $|z| < \frac{1}{L}$ and
does not converge if $|z| > \frac{1}{L}$.


Proof If $L = +\infty$ and $z \ne 0$, then $\sqrt[k]{|a_k|} > \frac{1}{|z|}$ for infinitely many $k$, hence

$$|a_k z^k| = \bigl( \sqrt[k]{|a_k|}\, |z| \bigr)^k > 1, \quad \text{for infinitely many } k.$$

If the series were to converge, then we would have $\lim_k a_k z^k = 0$, but this is not so.
Hence, if $L = +\infty$, then the series converges only for $z = 0$.
If $L = 0$, then for any $z \in \mathbb{C}$ we have that

$$\limsup_k \sqrt[k]{|a_k z^k|} = |z| \limsup_k \sqrt[k]{|a_k|} = 0,$$

so, by the root test, the series converges absolutely.
Assume now $L \in \,]0, +\infty[$. If $|z| < \frac{1}{L}$, then

$$\limsup_k \sqrt[k]{|a_k z^k|} = |z| \limsup_k \sqrt[k]{|a_k|} = |z| L < 1,$$

and, by the root test, the series converges absolutely. In contrast, if $|z| > \frac{1}{L}$, i.e.,
$L > \frac{1}{|z|}$, then $\sqrt[k]{|a_k|} > \frac{1}{|z|}$ for infinitely many $k$, and hence

$$|a_k z^k| = \bigl( \sqrt[k]{|a_k|}\, |z| \bigr)^k > 1, \quad \text{for infinitely many } k.$$

If the series were to converge, then we would have $\lim_k a_k z^k = 0$, but this is not the
case. ∎


We define the “convergence radius” $r$ of the series $(PS)_{\mathbb{C}}$ as follows:

$$r = \begin{cases} 0 & \text{if } L = +\infty, \\ +\infty & \text{if } L = 0, \\ \dfrac{1}{L} & \text{if } L \in \,]0, +\infty[. \end{cases}$$

If $r > 0$, we will say that $B(0, r)$ is the “convergence disk” of the series. We
emphasize that it is an open ball. If $r = +\infty$, then we set $B(0, r) = \mathbb{C}$.
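
The quantity $L$ is a limit superior and therefore cannot be computed from finitely many coefficients. Still, when an explicit formula for $|a_k|$ is available, looking at $\sqrt[k]{|a_k|}$ for large $k$ gives a useful heuristic for the convergence radius. The following sketch rests on that assumption; `root_estimate` is a hypothetical helper, and logarithms are used to avoid overflow.

```python
import math


def root_estimate(log_abs_coeff, k_min, k_max):
    """Heuristic stand-in for limsup |a_k|**(1/k): the largest value of
    exp(log|a_k| / k) over the window k_min <= k <= k_max."""
    return max(math.exp(log_abs_coeff(k) / k) for k in range(k_min, k_max + 1))


# a_k = 2**k gives L = 2, hence r = 1/2.
L_geometric = root_estimate(lambda k: k * math.log(2.0), 1000, 2000)
# a_k = k gives L = 1 (because k**(1/k) tends to 1), hence r = 1.
L_linear = root_estimate(lambda k: math.log(k), 1000, 2000)
# a_k = 1/k! gives L = 0, hence r = +infinity; the estimate is small but positive.
L_factorial = root_estimate(lambda k: -math.lgamma(k + 1), 1000, 2000)

print(1 / L_geometric, 1 / L_linear, L_factorial)
```

The second and third cases show the limitation of the heuristic: the estimates approach 1 and 0 only slowly as the window moves to larger indices.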




If $r > 0$, then the series $(PS)_{\mathbb{C}}$ converges pointwise in $B(0, r)$. However, this
convergence might not be uniform. We now see that the convergence is uniform
on any smaller disk.


Theorem 8.19 Assume $r > 0$. Then for every $\rho \in \,]0, r[$ the series $(PS)_{\mathbb{C}}$ converges
in norm, hence uniformly, on $\overline{B}(0, \rho)$.


Proof Let $\rho \in \,]0, r[$ be fixed. Then

$$\sup\{ |a_k z^k| : |z| \le \rho \} = |a_k| \rho^k,$$

and since

$$\limsup_k \sqrt[k]{|a_k| \rho^k} = \rho \limsup_k \sqrt[k]{|a_k|} = \rho L < 1$$

(indeed, $\rho L < rL = 1$ if $L > 0$, whereas $\rho L = 0$ if $L = 0$), then, by the root test, the
series $\sum_k |a_k| \rho^k$ converges. We have thus seen that the
series $(PS)_{\mathbb{C}}$ converges in norm on $\overline{B}(0, \rho)$. ∎


Corollary 8.20 If $r > 0$, and $f : B(0, r) \to \mathbb{C}$ is defined as

$$f(z) = \sum_{k=0}^{\infty} a_k z^k,$$

then $f$ is continuous on $B(0, r)$.


Proof From the previous theorem, for every $\rho \in \,]0, r[$ the convergence is uniform
on $\overline{B}(0, \rho)$; hence, $f$ is continuous on $\overline{B}(0, \rho)$. Since $\rho \in \,]0, r[$ is arbitrary, $f$ is
thus continuous at every point of $B(0, r)$. ∎


Remark 8.21 The above theory can be easily generalized to series of the type

$$\sum_{k=0}^{\infty} a_k (z - z_0)^k$$

for some fixed point $z_0 \in \mathbb{C}$. (Indeed, the change of variables $u = z - z_0$ leads back
to the previously considered case.) The convergence disk in this case is $B(z_0, r) =
\{ z \in \mathbb{C} : |z - z_0| < r \}$.




8.4.2 The Complex Exponential Function


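Let us now examine, for every $z \in \mathbb{C}$, the series

$$\sum_{k=0}^{\infty} \frac{z^k}{k!}.$$

It is a power series, which converges absolutely, since, for $z \ne 0$,

$$\lim_k \frac{|z|^{k+1}}{(k+1)!} \cdot \frac{k!}{|z|^k} = \lim_k \frac{|z|}{k+1} = 0,$$

and the ratio test applies. Denoting by $F(z)$ the sum of the series, we thus have a function
$F : \mathbb{C} \to \mathbb{C}$ which, on real numbers, coincides with the exponential function.
We can then interpret this function $F$ as an extension of the exponential function
to the complex plane $\mathbb{C}$. For this reason, we will call $F$ the “complex exponential
function” and write either $\exp(z)$ or $e^z$ instead of $F(z)$.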


Theorem 8.22 For every $z_1$ and $z_2$ in $\mathbb{C}$ we have that

$$\exp(z_1 + z_2) = \exp(z_1) \exp(z_2).$$


Proof The series $\sum_{k=0}^{\infty} \frac{z_1^k}{k!}$ and $\sum_{k=0}^{\infty} \frac{z_2^k}{k!}$ converge absolutely, and their sums are
$\exp(z_1)$ and $\exp(z_2)$, respectively. Then, by Mertens’ Theorem 8.12, the Cauchy
product series converges, and its sum is $\exp(z_1) \exp(z_2)$. On the other hand, the
Cauchy product series is

$$\sum_{k=0}^{\infty} \Bigl( \sum_{j=0}^{k} \frac{z_1^{k-j}}{(k-j)!} \frac{z_2^j}{j!} \Bigr) = \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{j=0}^{k} \binom{k}{j} z_1^{k-j} z_2^j = \sum_{k=0}^{\infty} \frac{(z_1 + z_2)^k}{k!},$$

and its sum is $\exp(z_1 + z_2)$, whence the conclusion. ∎


Writing $z = x + iy$, we obtain

$$\exp(x + iy) = \exp(x) \exp(iy).$$

Moreover,

$$\exp(iy) = \sum_{k=0}^{\infty} \frac{(iy)^k}{k!} = \lim_n g_n(y),$$

where $g_n(y) = \sum_{k=0}^{n} \frac{(iy)^k}{k!}$. We thus have that

$$g_n(y) = g_n^{(1)}(y) + i\, g_n^{(2)}(y),$$

where

$$g_n^{(1)}(y) = 1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots + (-1)^m \frac{y^{2m}}{(2m)!}$$

if either $n = 2m$ or $n = 2m + 1$, whereas

$$g_n^{(2)}(y) = y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots + (-1)^m \frac{y^{2m+1}}{(2m+1)!}$$

if either $n = 2m + 1$ or $n = 2m + 2$. Since

$$\lim_n g_n^{(1)}(y) = \cos y, \qquad \lim_n g_n^{(2)}(y) = \sin y,$$

we conclude that

$$\lim_n g_n(y) = \bigl( \lim_n g_n^{(1)}(y), \lim_n g_n^{(2)}(y) \bigr) = (\cos y, \sin y),$$

hence

$$e^{x + iy} = e^x (\cos y + i \sin y).$$

This is the Euler formula.
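
The Euler formula can be checked numerically by summing the defining power series directly. The sketch below is only an illustration: `exp_partial_sum` is a hypothetical helper computing $\sum_{k=0}^{n} z^k / k!$, and the sample values of `x`, `y` and the number of terms are arbitrary assumptions.

```python
import cmath
import math


def exp_partial_sum(z, n):
    """Partial sum sum_{k=0}^{n} z**k / k!, accumulated term by term."""
    term, total = 1.0 + 0.0j, 0.0 + 0.0j
    for k in range(n + 1):
        total += term          # add z**k / k!
        term *= z / (k + 1)    # update to z**(k+1) / (k+1)!
    return total


x, y = 0.7, 2.5
approx = exp_partial_sum(complex(x, y), 40)
euler = math.exp(x) * complex(math.cos(y), math.sin(y))

assert abs(approx - euler) < 1e-12
assert abs(approx - cmath.exp(complex(x, y))) < 1e-12
print(approx, euler)
```

Updating the term recursively avoids computing large powers and factorials separately, which keeps the rounding errors small for moderate $|z|$.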




It can be easily verified that

$$\cos t = \frac{e^{it} + e^{-it}}{2}, \qquad \sin t = \frac{e^{it} - e^{-it}}{2i}.$$

These formulas can be used to extend the functions cos and sin to the complex field
by simply taking $t \in \mathbb{C}$. The hyperbolic functions can also be extended to $\mathbb{C}$ by the
formulas

$$\cosh z = \frac{e^z + e^{-z}}{2}, \qquad \sinh z = \frac{e^z - e^{-z}}{2}.$$

Note that

$$\cos t = \cosh(it), \qquad \sin t = -i \sinh(it).$$

The analogies between the trigonometric and the hyperbolic functions are now
surely better understood.

8.4.3 Taylor Series 


Let us now consider the power series

$$(PS)_{\mathbb{R}} \qquad \sum_{k=0}^{\infty} a_k x^k,$$

where $x \in \mathbb{R}$ and all coefficients $a_k$ are also real numbers. We are thus considering
the series $\sum_k f_k$, where the functions $f_k : \mathbb{R} \to \mathbb{R}$ are defined by $f_k(x) = a_k x^k$.
Hence, if $r > 0$, then the convergence disk $B(0, r)$ now reduces to the interval
$]-r, r[$, and if $r = +\infty$, then it is the whole real line $\mathbb{R}$. In these cases, we might
wonder whether the sum of the series $(PS)_{\mathbb{R}}$ is differentiable on $]-r, r[$.


Theorem 8.23 Let $r > 0$ be the convergence radius of the series $(PS)_{\mathbb{R}}$, and let

$$f(x) = \sum_{k=0}^{\infty} a_k x^k, \quad \text{for every } x \in \,]-r, r[.$$

Then the series

$$\sum_{k=1}^{\infty} k a_k x^{k-1}$$

has the same convergence radius $r$. Moreover, the function $f : \,]-r, r[\, \to \mathbb{R}$ is
differentiable, and

$$f'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1}, \quad \text{for every } x \in \,]-r, r[.$$


Proof Since

$$\limsup_k \sqrt[k]{|k a_k|} = \lim_k \sqrt[k]{k}\, \limsup_k \sqrt[k]{|a_k|} = \limsup_k \sqrt[k]{|a_k|},$$

we see that the convergence radius for the series $\sum_k k a_k x^{k-1}$ is equal to $r$. We can
then define the new function $g : \,]-r, r[\, \to \mathbb{R}$ as $g(x) = \sum_{k=1}^{\infty} k a_k x^{k-1}$. For any
fixed $\rho \in \,]0, r[$ we know that the convergence of the series is uniform on $[-\rho, \rho]$.
Setting $f_k(x) = a_k x^k$, we have that $f = \sum_k f_k$ and $g = \sum_k f'_k$, and
the convergence is uniform on $[-\rho, \rho]$. By Theorem 8.16, $f$ is differentiable on
$[-\rho, \rho]$ and $f'(x) = g(x)$ for every $x \in [-\rho, \rho]$. The conclusion follows since
$\rho \in \,]0, r[$ is arbitrary. ∎


Iterating the same argument and making use of Theorem 8.17 we easily obtain
the following generalization.


Theorem 8.24 Let $r > 0$ be the convergence radius of the series $(PS)_{\mathbb{R}}$, and let

$$f(x) = \sum_{k=0}^{\infty} a_k x^k, \quad \text{for every } x \in \,]-r, r[.$$

Then the series

$$\sum_{k=1}^{\infty} k a_k x^{k-1}, \quad \sum_{k=2}^{\infty} k(k-1) a_k x^{k-2}, \quad \sum_{k=3}^{\infty} k(k-1)(k-2) a_k x^{k-3}, \quad \ldots,$$

$$\sum_{k=m}^{\infty} k(k-1)(k-2) \cdots (k-m+1) a_k x^{k-m}, \quad \ldots$$

all have the same convergence radius $r$. Moreover, the function $f : \,]-r, r[\, \to \mathbb{R}$ is
infinitely differentiable and, for every positive integer $j$,

$$f^{(j)}(x) = \sum_{k=j}^{\infty} k(k-1)(k-2) \cdots (k-j+1) a_k x^{k-j}, \quad \text{for every } x \in \,]-r, r[.$$


Note now that, taking $x = 0$ in the previous formula, we obtain

$$f^{(j)}(0) = j!\, a_j$$

for every $j \in \mathbb{N}$ (recalling that $f^{(0)} = f$). Then

$$f(x) = \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} \frac{1}{k!} f^{(k)}(0)\, x^k.$$

This is the “Taylor series” associated with the function $f$ at $x_0 = 0$. We have thus
proved that any power series with positive convergence radius $r$ defines a function
$f$ that is analytic on $]-r, r[$.


Remark 8.25 Referring to Remark 8.21, we can also extend the foregoing considerations,
made for the series $(PS)_{\mathbb{R}}$, to power series of the type

$$\sum_{k=0}^{\infty} a_k (x - x_0)^k$$

for some fixed point $x_0$. If the convergence radius $r$ is positive, then the convergence
disk is $]x_0 - r, x_0 + r[$, and the function $f : \,]x_0 - r, x_0 + r[\, \to \mathbb{R}$ defined by the
sum of the series can be expressed by

$$f(x) = \sum_{k=0}^{\infty} \frac{1}{k!} f^{(k)}(x_0) (x - x_0)^k,$$

i.e., $f(x) = \lim_n p_n(x)$, where $p_n(x)$ is the Taylor polynomial defined by (6.3).


8.4.4 Fourier Series 


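Let us now consider the “trigonometric polynomials” having some fixed period $T >
0$. They are defined as

$$f_n(t) = c_0 + \sum_{k=1}^{n} \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr),$$

where $c_0$, $a_k$, and $b_k$ are some real constants. We are interested in examining the
convergence of the sequence of functions $(f_n)_n$.

Theorem 8.26 If there exists a function $f : [0, T] \to \mathbb{R}$ such that

$$\lim_n f_n(t) = f(t), \quad \text{uniformly on } [0, T],$$

then necessarily

$$c_0 = \frac{1}{T} \int_0^T f(t)\, dt,$$
$$a_k = \frac{2}{T} \int_0^T f(t) \cos\Bigl( \frac{2\pi k}{T} t \Bigr) dt,$$
$$b_k = \frac{2}{T} \int_0^T f(t) \sin\Bigl( \frac{2\pi k}{T} t \Bigr) dt.$$


Proof By Theorem 8.15,

$$\int_0^T f(t)\, dt = \int_0^T \Bigl( c_0 + \sum_{k=1}^{\infty} \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr) \Bigr) dt$$
$$= c_0 T + \sum_{k=1}^{\infty} \int_0^T \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr) dt = c_0 T,$$

whence the formula for $c_0$. Analogously, for any integer $j \ge 1$,

$$\int_0^T f(t) \cos\Bigl( \frac{2\pi j}{T} t \Bigr) dt = \int_0^T \Bigl( c_0 + \sum_{k=1}^{\infty} \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr) \Bigr) \cos\Bigl( \frac{2\pi j}{T} t \Bigr) dt$$
$$= \sum_{k=1}^{\infty} \int_0^T \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr) \cos\Bigl( \frac{2\pi j}{T} t \Bigr) dt.$$

On the other hand, integrating by parts twice, we see that, for any positive integer
$k \ne j$,

$$\int_0^T \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \cos\Bigl( \frac{2\pi j}{T} t \Bigr) dt = 0, \qquad \int_0^T \cos\Bigl( \frac{2\pi k}{T} t \Bigr) \cos\Bigl( \frac{2\pi j}{T} t \Bigr) dt = 0,$$

whereas, if $k = j$, then

$$\int_0^T \sin\Bigl( \frac{2\pi j}{T} t \Bigr) \cos\Bigl( \frac{2\pi j}{T} t \Bigr) dt = 0, \qquad \int_0^T \cos^2\Bigl( \frac{2\pi j}{T} t \Bigr) dt = \frac{T}{2}.$$

Hence,

$$\int_0^T f(t) \cos\Bigl( \frac{2\pi j}{T} t \Bigr) dt = \frac{T}{2} a_j,$$

yielding the formula for $a_j$. Similarly one can see that

$$\int_0^T f(t) \sin\Bigl( \frac{2\pi j}{T} t \Bigr) dt = \frac{T}{2} b_j,$$

providing the formula for $b_j$. ∎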
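
The formulas of Theorem 8.26 can be tested on a trigonometric polynomial whose coefficients are known in advance. In the sketch below everything is an assumption made for illustration: the period `T`, the sample function `f`, and the helper `fourier_coefficients`, which replaces the integrals by equally spaced Riemann sums (very accurate for smooth periodic integrands).

```python
import math


def fourier_coefficients(f, T, k, n_samples=4096):
    """Approximate a_k and b_k, i.e. (2/T) * integral over [0, T] of
    f(t) cos(2 pi k t / T) and f(t) sin(2 pi k t / T), by a Riemann sum."""
    h = T / n_samples
    a = b = 0.0
    for m in range(n_samples):
        t = m * h
        a += f(t) * math.cos(2 * math.pi * k * t / T)
        b += f(t) * math.sin(2 * math.pi * k * t / T)
    return 2 * a * h / T, 2 * b * h / T


T = 3.0


def f(t):
    # Hypothetical trigonometric polynomial with c_0 = 1.5, a_2 = -2, b_5 = 0.25.
    return (1.5
            - 2.0 * math.cos(2 * math.pi * 2 * t / T)
            + 0.25 * math.sin(2 * math.pi * 5 * t / T))


a0, _ = fourier_coefficients(f, T, 0)   # for k = 0 the formula returns 2 * c_0
print(a0 / 2, fourier_coefficients(f, T, 2), fourier_coefficients(f, T, 5))
```

Up to rounding errors, the printed values reproduce the chosen coefficients, and every other coefficient is numerically zero.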


For any given continuous function $f : [0, T] \to \mathbb{R}$, we define its “Fourier
coefficients”

$$a_k = \frac{2}{T} \int_0^T f(t) \cos\Bigl( \frac{2\pi k}{T} t \Bigr) dt, \qquad b_k = \frac{2}{T} \int_0^T f(t) \sin\Bigl( \frac{2\pi k}{T} t \Bigr) dt$$

and its “Fourier series”

$$\frac{a_0}{2} + \sum_{k=1}^{\infty} \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr).$$

The problem is: Does this series converge for every $t \in [0, T]$?

The answer is, in general, no: There exist continuous functions $f : [0, T] \to \mathbb{R}$
for which the Fourier series fails to converge at some points $t \in [0, T]$. However,
there are many ways to overcome this difficulty. We will very briefly review some
of them.

Let us define the partial sums of the Fourier series as

$$f_n(t) = \frac{a_0}{2} + \sum_{k=1}^{n} \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr).$$

We can now also define the “Cesàro means”

$$\sigma_n(t) = \frac{1}{n+1} \bigl[ f_0(t) + f_1(t) + \cdots + f_n(t) \bigr]$$

so as to be able to state, without proof, the following theorem.




Theorem 8.27 (Fejér Theorem) If $f : \mathbb{R} \to \mathbb{R}$ is continuous and $T$-periodic,
then

$$\lim_n \sigma_n(t) = f(t)$$

uniformly for every $t \in \mathbb{R}$.

Here is a direct consequence.

Corollary 8.28 Let $f, \tilde{f} : \mathbb{R} \to \mathbb{R}$ be continuous and $T$-periodic functions. If the
respective Fourier coefficients are such that $\tilde{a}_k = a_k$ and $\tilde{b}_k = b_k$ for every $k$, then
$f$ and $\tilde{f}$ coincide.


Proof With the notations adapted to this situation, we will have that $\sigma_n(t) = \tilde{\sigma}_n(t)$,
for every $n$, and hence

$$f(t) - \tilde{f}(t) = \lim_n (\sigma_n(t) - \tilde{\sigma}_n(t)) = 0$$

for every $t \in \mathbb{R}$. ∎



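We could also define the “complex Fourier coefficients”

$$c_k = \frac{1}{T} \int_0^T f(t)\, e^{-i \frac{2\pi k}{T} t}\, dt$$

for $k \in \mathbb{Z}$. Setting $b_0 = 0$, we see that

$$c_k = \begin{cases} \frac{1}{2} (a_{-k} + i b_{-k}) & \text{if } k < 0, \\ \frac{1}{2} (a_k - i b_k) & \text{if } k \ge 0, \end{cases}$$

so that

$$f_n(t) = \sum_{k=-n}^{n} c_k\, e^{i \frac{2\pi k}{T} t}.$$

In what follows, we use the notation

$$\sum_{k=-\infty}^{\infty} \alpha_k = \lim_n \Bigl( \sum_{k=-n}^{n} \alpha_k \Bigr).$$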




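Corollary 8.29 Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous and $T$-periodic function. If the
series $\sum_{k=-\infty}^{\infty} |c_k|$ converges, then

$$\lim_n f_n(t) = f(t)$$

uniformly for every $t \in \mathbb{R}$.


Proof Observe that

$$\bigl| c_k\, e^{i \frac{2\pi k}{T} t} \bigr| = |c_k|.$$

Hence, if the series $\sum_{k=-\infty}^{\infty} |c_k|$ converges, by Theorem 8.14 the sequence $(f_n)_n$
uniformly converges to some continuous function $\tilde{f} : \mathbb{R} \to \mathbb{R}$, which is $T$-periodic.
On the other hand, for this function,

$$\tilde{c}_k = \frac{1}{T} \int_0^T \tilde{f}(t)\, e^{-i \frac{2\pi k}{T} t}\, dt = \frac{1}{T} \int_0^T \bigl( \lim_n f_n(t) \bigr) e^{-i \frac{2\pi k}{T} t}\, dt$$
$$= \lim_n \frac{1}{T} \int_0^T f_n(t)\, e^{-i \frac{2\pi k}{T} t}\, dt = \lim_n \frac{1}{T} \int_0^T \sum_{j=-n}^{n} c_j\, e^{i \frac{2\pi (j-k)}{T} t}\, dt$$
$$= \lim_n \frac{1}{T} \int_0^T c_k\, dt = c_k$$

for every $k \in \mathbb{Z}$. By Corollary 8.28, the two functions $f$ and $\tilde{f}$ coincide, thereby
completing the proof. ∎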


In the following theorem, the function $f$ could be discontinuous at some points.


Theorem 8.30 (Dirichlet Theorem) Let $f : \mathbb{R} \to \mathbb{R}$ be a $T$-periodic function.
Assume that there is a finite number of points $t_0, t_1, t_2, \ldots, t_N$, with

$$0 = t_0 < t_1 < t_2 < \cdots < t_N = T,$$

having the property that $f$ is continuously differentiable on every interval $]t_{j-1}, t_j[$,
with $j = 1, 2, \ldots, N$. At the points $t_j$ (where the function could either fail to be
continuous, or, if continuous, it could fail to be differentiable), the following finite
limits must exist:

$$\lim_{s \to t_j^-} f(s), \quad \lim_{s \to t_j^+} f(s), \quad \lim_{s \to t_j^-} f'(s), \quad \lim_{s \to t_j^+} f'(s).$$

Then, for every $t \in [0, T]$,

$$\frac{a_0}{2} + \sum_{k=1}^{\infty} \Bigl( a_k \cos\Bigl( \frac{2\pi k}{T} t \Bigr) + b_k \sin\Bigl( \frac{2\pi k}{T} t \Bigr) \Bigr) = \frac{1}{2} \Bigl( \lim_{s \to t^-} f(s) + \lim_{s \to t^+} f(s) \Bigr).$$

Moreover, the convergence is uniform on every compact interval where $f$ is
continuous.


Note that if $f$ is continuous at $t$, then

$$f(t) = \frac{1}{2} \Bigl( \lim_{s \to t^-} f(s) + \lim_{s \to t^+} f(s) \Bigr).$$

For the proofs of Theorems 8.27 and 8.30, we refer the reader to the book by Körner
[4].
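We now provide two examples of applications of the preceding theorem.

Example 1 Let $f : \mathbb{R} \to \mathbb{R}$ be the $2\pi$-periodic function defined as

$$f(t) = t, \quad \text{if } t \in [-\pi, \pi[.$$

It is readily seen that the assumptions of the Dirichlet theorem are satisfied. We
compute

$$\frac{a_0}{2} = \frac{1}{2\pi} \int_0^{2\pi} f(t)\, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} t\, dt = 0,$$
$$a_k = \frac{2}{2\pi} \int_0^{2\pi} f(t) \cos(kt)\, dt = \frac{1}{\pi} \int_{-\pi}^{\pi} t \cos(kt)\, dt = 0,$$

since $t \mapsto t \cos(kt)$ is an odd function; and, integrating by parts,

$$b_k = \frac{2}{2\pi} \int_0^{2\pi} f(t) \sin(kt)\, dt = \frac{1}{\pi} \int_{-\pi}^{\pi} t \sin(kt)\, dt$$
$$= \frac{1}{\pi} \Bigl( \Bigl[ -t\, \frac{\cos(kt)}{k} \Bigr]_{-\pi}^{\pi} + \frac{1}{k} \int_{-\pi}^{\pi} \cos(kt)\, dt \Bigr) = \frac{2(-1)^{k+1}}{k}.$$

We can thus state that

$$f(t) = \sum_{k=1}^{\infty} \frac{2(-1)^{k+1}}{k} \sin(kt), \quad \text{for every } t \in \,]-\pi, \pi[.$$

As a particular case, taking $t = \frac{\pi}{2}$, we obtain the nice formula

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$$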
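
The convergence claimed in Example 1 is slow, which can be appreciated numerically. The following sketch is only an experiment: `sawtooth_partial_sum` is a hypothetical name for the partial sums of the series of $f(t) = t$, and the numbers of terms are arbitrary assumptions.

```python
import math


def sawtooth_partial_sum(t, n):
    """Partial sum sum_{k=1}^{n} 2 * (-1)**(k+1) / k * sin(k t)."""
    return sum(2.0 * (-1) ** (k + 1) / k * math.sin(k * t) for k in range(1, n + 1))


# Inside ]-pi, pi[ the partial sums should approach f(t) = t.
for t in (0.5, math.pi / 2, 3.0):
    print(t, sawtooth_partial_sum(t, 20000))

# The particular case t = pi/2 gives the series 1 - 1/3 + 1/5 - 1/7 + ...
leibniz = sum((-1) ** m / (2 * m + 1) for m in range(100000))
print(leibniz, math.pi / 4)
```

Close to the discontinuity at $t = \pi$ many more terms are needed for the same accuracy, in line with the fact that the convergence is uniform only on compact intervals where $f$ is continuous.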


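Example 2 Let $f : \mathbb{R} \to \mathbb{R}$ be the $2\pi$-periodic function defined as

$$f(t) = t^2, \quad \text{if } t \in [-\pi, \pi[.$$

It is readily seen that the assumptions of the Dirichlet theorem are satisfied. We
compute

$$\frac{a_0}{2} = \frac{1}{2\pi} \int_0^{2\pi} f(t)\, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} t^2\, dt = \frac{\pi^2}{3},$$

and, integrating by parts,

$$a_k = \frac{2}{2\pi} \int_0^{2\pi} f(t) \cos(kt)\, dt = \frac{1}{\pi} \int_{-\pi}^{\pi} t^2 \cos(kt)\, dt$$
$$= \frac{1}{\pi} \Bigl( \Bigl[ t^2\, \frac{\sin(kt)}{k} \Bigr]_{-\pi}^{\pi} - \int_{-\pi}^{\pi} 2t\, \frac{\sin(kt)}{k}\, dt \Bigr) = -\frac{2}{\pi k} \int_{-\pi}^{\pi} t \sin(kt)\, dt$$
$$= -\frac{2}{\pi k} \Bigl( \Bigl[ -t\, \frac{\cos(kt)}{k} \Bigr]_{-\pi}^{\pi} + \frac{1}{k} \int_{-\pi}^{\pi} \cos(kt)\, dt \Bigr) = \frac{4(-1)^k}{k^2}.$$

On the other hand,

$$b_k = \frac{2}{2\pi} \int_0^{2\pi} f(t) \sin(kt)\, dt = \frac{1}{\pi} \int_{-\pi}^{\pi} t^2 \sin(kt)\, dt = 0$$

since $t \mapsto t^2 \sin(kt)$ is an odd function. Since $f : \mathbb{R} \to \mathbb{R}$ is continuous, we can
then state that

$$f(t) = \frac{\pi^2}{3} + \sum_{k=1}^{\infty} \frac{4(-1)^k}{k^2} \cos(kt), \quad \text{for every } t \in \mathbb{R}.$$

Let us focus on two interesting cases. If $t = \pi$, we obtain the formula

$$\pi^2 = \frac{\pi^2}{3} + \sum_{k=1}^{\infty} \frac{4}{k^2},$$

yielding

$$\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}.$$

If $t = 0$, we have

$$0 = \frac{\pi^2}{3} + \sum_{k=1}^{\infty} \frac{4(-1)^k}{k^2},$$

giving us the formula

$$\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k^2} = \frac{\pi^2}{12}.$$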
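
The two numerical formulas obtained from Example 2 are easy to test. The sketch below is an illustration only, with hypothetical names: `parabola_partial_sum` evaluates the partial sums of the Fourier series of the $2\pi$-periodic extension of $t^2$, while `basel` and `alternating` are truncated versions of the two series of reciprocals of squares.

```python
import math


def parabola_partial_sum(t, n):
    """pi**2 / 3 + sum_{k=1}^{n} 4 * (-1)**k / k**2 * cos(k t)."""
    return math.pi**2 / 3 + sum(
        4.0 * (-1) ** k / k**2 * math.cos(k * t) for k in range(1, n + 1)
    )


n = 100000
basel = sum(1.0 / k**2 for k in range(1, n + 1))
alternating = sum((-1) ** (k + 1) / k**2 for k in range(1, n + 1))

print(basel, math.pi**2 / 6)
print(alternating, math.pi**2 / 12)
print(parabola_partial_sum(1.2, n), 1.2**2)
```

Here the coefficients decay like $1/k^2$, so the series converges in norm and the agreement is much better than for the discontinuous function of Example 1.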


8.5 Series and Integrals 


We now prove a theorem that shows the close connection between the theory of
numerical series and that of the integral.


Theorem 8.31 Let $f : [1, +\infty[\, \to \mathbb{R}$ be a function that is positive, decreasing, and
integrable on $[1, c]$ for every $c > 1$. Then the series $\sum_{k=1}^{\infty} f(k)$ converges if and
only if $f$ is integrable on $[1, +\infty[$. Moreover, we have

$$\int_1^{+\infty} f \le \sum_{k=1}^{\infty} f(k) \le f(1) + \int_1^{+\infty} f.$$


Proof For $x \in [k, k+1]$, it must be that $f(k+1) \le f(x) \le f(k)$, hence

$$f(k+1) \le \int_k^{k+1} f(x)\, dx \le f(k).$$

Summing up, we obtain

$$\sum_{k=1}^{n} f(k+1) \le \int_1^{n+1} f(x)\, dx \le \sum_{k=1}^{n} f(k).$$

Since $f$ is positive, the sequence $\bigl( \sum_{k=1}^{n} f(k) \bigr)_n$ and the function $c \mapsto \int_1^c f$ are
both increasing and, therefore, have a limit. The conclusion now follows from the
comparison theorem for limits. ∎
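
The estimate of Theorem 8.31 becomes sharper when the first terms are summed explicitly and the integral comparison is applied only to the tail. The sketch below illustrates this for $f(x) = x^{-3}$, using $\int_N^{+\infty} x^{-3}\, dx = 1/(2N^2)$; the function name is a hypothetical choice, and the reference value $1.2020569\ldots$ of the sum is quoted only for comparison.

```python
def bounds_for_sum_of_inverse_cubes(N):
    """Enclose S = sum_{k>=1} k**-3 using the integral of x**-3 over [N, +inf[,
    which equals 1 / (2 N**2):
        head + tail_integral <= S <= head + N**-3 + tail_integral,
    where head = sum_{k=1}^{N-1} k**-3."""
    head = sum(k ** -3.0 for k in range(1, N))
    tail_integral = 1.0 / (2.0 * N * N)
    return head + tail_integral, head + N ** -3.0 + tail_integral


for N in (1, 3, 10, 100):
    low, high = bounds_for_sum_of_inverse_cubes(N)
    print(N, low, high)
```

The width of the enclosing interval is $f(N) = N^{-3}$, so a few explicit terms already give several correct digits.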




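It should be clear that the choice of the starting point $a = 1$ for both the integral
and the series is by no means mandatory.


Example Consider the series $\sum_{k=1}^{\infty} k^{-3}$; in this case,

$$\int_1^{+\infty} \frac{1}{x^3}\, dx = \frac{1}{2},$$

and then

$$\frac{1}{2} \le \sum_{k=1}^{\infty} \frac{1}{k^3} \le \frac{3}{2}.$$

A greater accuracy is easily attained by computing the sum of the first few terms
and then using the estimate given by the integral. For example, separating the first
two terms, we have that

$$\sum_{k=1}^{\infty} \frac{1}{k^3} = 1 + \frac{1}{8} + \sum_{k=3}^{\infty} \frac{1}{k^3},$$

with

$$\frac{1}{18} = \int_3^{+\infty} \frac{1}{x^3}\, dx \le \sum_{k=3}^{\infty} \frac{1}{k^3} \le \frac{1}{27} + \frac{1}{18}.$$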



9.1 Saks—Henstock Theorem 


Let us further analyze the definition of integral for a function $f : I \to \mathbb{R}$ when
$I = [a, b]$ is a compact interval.

The function $f$ is integrable on $I$ with the integral $\int_I f$ if, for every $\varepsilon > 0$, there
is a gauge $\delta$ on $I$ such that, for every $\delta$-fine tagged partition

$$\mathcal{P} = \{ (x_1, [a_0, a_1]), \ldots, (x_m, [a_{m-1}, a_m]) \}$$

of $I$, we have that $\bigl| S(f, \mathcal{P}) - \int_I f \bigr| \le \varepsilon$. Then, since

$$S(f, \mathcal{P}) = \sum_{j=1}^{m} f(x_j)(a_j - a_{j-1}), \qquad \int_I f = \sum_{j=1}^{m} \int_{a_{j-1}}^{a_j} f,$$

we have that

$$\Bigl| \sum_{j=1}^{m} \Bigl( f(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f \Bigr) \Bigr| \le \varepsilon.$$

This fact tells us that the sum of all “errors” $\bigl( f(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f \bigr)$ is
arbitrarily small, provided that the tagged partition is sufficiently fine. Note that
those “errors” may be either positive or negative, so that in the sum they could
compensate for one another. The following theorem tells us that even the sum of all
absolute values of those “errors” can be made arbitrarily small.


Theorem 9.1 (Saks–Henstock Theorem I) Let $f : I \to \mathbb{R}$ be an integrable
function, and let $\delta$ be a gauge on $I$ such that, for every $\delta$-fine tagged partition $\mathcal{P}$
of $I$, it happens that $\bigl| S(f, \mathcal{P}) - \int_I f \bigr| \le \varepsilon$. Then for all such tagged partitions
$\mathcal{P} = \{ (x_1, [a_0, a_1]), \ldots, (x_m, [a_{m-1}, a_m]) \}$ we also have that

$$\sum_{j=1}^{m} \Bigl| f(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f \Bigr| \le 4\varepsilon.$$


Proof We consider separately in the sum the positive and negative terms. Let us
prove that the sum of the positive terms is less than or equal to $2\varepsilon$. In an analogous
way one can proceed for the negative terms. Rearranging the terms in the sum, we
can assume that the positive ones are the first $q$ terms $\bigl( f(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f \bigr)$,
with $j = 1, \ldots, q$, i.e.,

$$f(x_1)(a_1 - a_0) - \int_{a_0}^{a_1} f, \quad \ldots, \quad f(x_q)(a_q - a_{q-1}) - \int_{a_{q-1}}^{a_q} f.$$

Consider the remaining $m - q$ intervals $[a_{k-1}, a_k]$, with $k = q+1, \ldots, m$, i.e.,

$$[a_q, a_{q+1}], \quad \ldots, \quad [a_{m-1}, a_m].$$

Since $f$ is integrable on these intervals, there exist some gauges $\delta_k$ on $[a_{k-1}, a_k]$,
respectively, which we can choose such that $\delta_k(x) \le \delta(x)$ for every $x \in [a_{k-1}, a_k]$,
for which

$$\Bigl| S(f, \mathcal{P}_k) - \int_{a_{k-1}}^{a_k} f \Bigr| \le \frac{\varepsilon}{m - q}$$

for every $\delta_k$-fine tagged partition $\mathcal{P}_k$ of $[a_{k-1}, a_k]$. Consequently, the family $\mathcal{Q}$
made by the couples $(x_1, [a_0, a_1]), \ldots, (x_q, [a_{q-1}, a_q])$ and by the elements of the
families $\mathcal{P}_k$, with $k$ varying from $q + 1$ to $m$, is a $\delta$-fine tagged partition of $I$ such
that

$$S(f, \mathcal{Q}) = \sum_{j=1}^{q} f(x_j)(a_j - a_{j-1}) + \sum_{k=q+1}^{m} S(f, \mathcal{P}_k).$$

Then we have

$$\sum_{j=1}^{q} \Bigl( f(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f \Bigr) = \sum_{j=1}^{q} f(x_j)(a_j - a_{j-1}) - \sum_{j=1}^{q} \int_{a_{j-1}}^{a_j} f$$
$$= \Bigl( S(f, \mathcal{Q}) - \sum_{k=q+1}^{m} S(f, \mathcal{P}_k) \Bigr) - \Bigl( \int_I f - \sum_{k=q+1}^{m} \int_{a_{k-1}}^{a_k} f \Bigr)$$
$$\le \Bigl| S(f, \mathcal{Q}) - \int_I f \Bigr| + \sum_{k=q+1}^{m} \Bigl| S(f, \mathcal{P}_k) - \int_{a_{k-1}}^{a_k} f \Bigr|$$
$$\le \varepsilon + (m - q) \frac{\varepsilon}{m - q} = 2\varepsilon.$$

Proceeding similarly for the negative terms, the conclusion follows. ∎
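
The difference between the sum of the “errors” and the sum of their absolute values can be observed on a simple example. The sketch below makes several simplifying assumptions: it takes $f(x) = x^2$ on $[0, 1]$, equal subintervals (so a constant gauge suffices), and tags alternating between left and right endpoints in order to produce errors of both signs; `local_errors` is a hypothetical helper.

```python
def local_errors(m):
    """Errors f(x_j)(a_j - a_{j-1}) - integral of f over [a_{j-1}, a_j] for
    f(x) = x**2 on [0, 1], with m equal subintervals and tags alternating
    between the left endpoint (odd j) and the right endpoint (even j)."""
    result = []
    for j in range(1, m + 1):
        left, right = (j - 1) / m, j / m
        tag = left if j % 2 else right
        exact = (right**3 - left**3) / 3.0   # integral of x**2 over [left, right]
        result.append(tag**2 * (right - left) - exact)
    return result


for m in (10, 100, 1000):
    e = local_errors(m)
    signed_total = abs(sum(e))                # errors may compensate each other
    absolute_total = sum(abs(x) for x in e)   # no compensation is possible here
    assert signed_total <= absolute_total
    print(m, signed_total, absolute_total)
```

Both quantities tend to zero as the partition gets finer, the second one more slowly, which is consistent with the bound $4\varepsilon$ in place of $\varepsilon$.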


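The following corollary will be useful in the next section to study the integrability
of the absolute value of an integrable function.


Corollary 9.2 Let $f : I \to \mathbb{R}$ be an integrable function, and let $\delta$ be a gauge on $I$
such that, for every $\delta$-fine tagged partition $\mathcal{P}$ of $I$, it happens that $\bigl| S(f, \mathcal{P}) - \int_I f \bigr| \le
\varepsilon$. Then for such tagged partitions $\mathcal{P} = \{ (x_1, [a_0, a_1]), \ldots, (x_m, [a_{m-1}, a_m]) \}$ we
also have

$$\Bigl| S(|f|, \mathcal{P}) - \sum_{j=1}^{m} \Bigl| \int_{a_{j-1}}^{a_j} f \Bigr| \Bigr| \le 4\varepsilon.$$


Proof Using the well-known inequalities for the absolute value, by Theorem 9.1,

$$\Bigl| S(|f|, \mathcal{P}) - \sum_{j=1}^{m} \Bigl| \int_{a_{j-1}}^{a_j} f \Bigr| \Bigr| = \Bigl| \sum_{j=1}^{m} \Bigl[ \bigl| f(x_j)(a_j - a_{j-1}) \bigr| - \Bigl| \int_{a_{j-1}}^{a_j} f \Bigr| \Bigr] \Bigr|$$
$$\le \sum_{j=1}^{m} \Bigl| \bigl| f(x_j)(a_j - a_{j-1}) \bigr| - \Bigl| \int_{a_{j-1}}^{a_j} f \Bigr| \Bigr|$$
$$\le \sum_{j=1}^{m} \Bigl| f(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f \Bigr| \le 4\varepsilon.$$

This completes the proof. ∎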


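The Saks–Henstock Theorem 9.1 can be generalized to tagged subpartitions.
Here is the statement.


Theorem 9.3 (Saks–Henstock Theorem II) Let $f : I \to \mathbb{R}$ be an integrable
function, and let $\delta$ be a gauge on $I$ such that, for every $\delta$-fine tagged partition $\mathcal{P}$ of
$I$, it happens that $\bigl| S(f, \mathcal{P}) - \int_I f \bigr| \le \varepsilon$. Then for every $\delta$-fine tagged subpartition
$\Xi = \{ (\xi_j, [\alpha_j, \beta_j]) : j = 1, \ldots, m \}$ of $I$ we have

$$\sum_{j=1}^{m} \Bigl| f(\xi_j)(\beta_j - \alpha_j) - \int_{\alpha_j}^{\beta_j} f \Bigr| \le 4\varepsilon.$$

As a consequence, for any such $\delta$-fine tagged subpartition,

$$\Bigl| S(f, \Xi) - \sum_{j=1}^{m} \int_{\alpha_j}^{\beta_j} f \Bigr| \le 4\varepsilon.$$


Proof By the Cousin theorem, it is possible to extend any $\delta$-fine tagged subpartition
$\Xi$ of $I$ to a whole $\delta$-fine tagged partition $\mathcal{P}$ of $I$. Hence, the Saks–Henstock
Theorem 9.1 applies, proving the first part of the statement.

Concerning the second part, we have

$$\Bigl| S(f, \Xi) - \sum_{j=1}^{m} \int_{\alpha_j}^{\beta_j} f \Bigr| = \Bigl| \sum_{j=1}^{m} f(\xi_j)(\beta_j - \alpha_j) - \sum_{j=1}^{m} \int_{\alpha_j}^{\beta_j} f \Bigr|$$
$$= \Bigl| \sum_{j=1}^{m} \Bigl( f(\xi_j)(\beta_j - \alpha_j) - \int_{\alpha_j}^{\beta_j} f \Bigr) \Bigr|$$
$$\le \sum_{j=1}^{m} \Bigl| f(\xi_j)(\beta_j - \alpha_j) - \int_{\alpha_j}^{\beta_j} f \Bigr| \le 4\varepsilon,$$

thereby completing the proof. ∎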


9.2 L-Integrable Functions 


In this section, we introduce another important class of integrable functions on the
interval $I = [a, b]$.


Definition 9.4 We say that an integrable function $f : I \to \mathbb{R}$ is “L-integrable” (or
“integrable according to Lebesgue”) if even $|f|$ happens to be integrable on $I$.


It is clear that every positive integrable function is L-integrable. Moreover, every
continuous function on $[a, b]$ is L-integrable there since $|f|$ is still continuous. We
have the following characterization of L-integrability.


Proposition 9.5 Let $f : I \to \mathbb{R}$ be an integrable function, and consider the set $S$
of all real numbers

$$\sum_{i=1}^{q} \Bigl| \int_{c_{i-1}}^{c_i} f \Bigr|$$

obtained choosing $c_0, c_1, \ldots, c_q$ in $I$ in such a way that $a = c_0 < c_1 < \cdots < c_q =
b$. The function $f$ is L-integrable on $I$ if and only if $S$ is bounded from above. In
that case, we have

$$\int_I |f| = \sup S.$$


Proof Assume first that f is L-integrable on J. If a = cg < cy < +++ < cg =), 
then f and | f| are integrable on every subinterval [cj_1, cj], and we have 


[s 2a i= fia. 


Consequently, the set S is bounded from above: supS < [ 1 \f| < +00. 
Conversely, assume now that S is bounded from above, and let us prove that in 
that case | f| is integrable on J and ue | f| = supS. Fix ¢ > 0. Let 6; be a gauge 


q 


oe 


i=1 


such that, for every 61-fine tagged partition P of I, we have 


oo. 
~ 8 


sum fr 


On the other hand, letting 7 = supS, by the properties of the supremum there 
surely are a = cy < Cy <--- < Cg = b such that 


[7 


€ q 
oo Se SS 


We construct the gauge 62 in such a way that, for every x € J, it must be that 
[x — d2(x), x + d2(x)] meets only those intervals [c;—1, cj] to which x belongs. In 
this way, 


¢ If x belongs to the interior of one of the intervals [cj—1, cj], we have that [x — 
62(x), x + 62(x)] is contained in Jcj—1, ci[ . 

e If x coincides with one of the c; in the interior of [a, b], then [x —d2(x), x +d2(x)] 
is contained in ]cj—1, cj+1[. 

¢ Ifx =a, then [x, x + 62(x)] is contained in [a, ci[. 

° Ifx =), then [x — d(x), x] is contained in ]cg—1, b]. 


Define $\delta(x) = \min\{\delta_1(x), \delta_2(x)\}$ for every $x \in I$. Once we take a $\delta$-fine tagged partition $P = \{(x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m])\}$ of $I$, consider the intervals (possibly empty or reduced to a point)

\[ I_{j,i} = [a_{j-1}, a_j] \cap [c_{i-1}, c_i]. \]

The choice of the gauge $\delta$ yields that, if $I_{j,i}$ has a positive length, then $x_j \in I_{j,i}$. Indeed, if $x_j \notin [c_{i-1}, c_i]$, then

\[ [a_{j-1}, a_j] \cap [c_{i-1}, c_i] \subseteq [x_j - \delta_2(x_j), x_j + \delta_2(x_j)] \cap [c_{i-1}, c_i] = \emptyset. \]


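Therefore, if we take those $I_{j,i}$, then the set

\[ Q = \{(x_j, I_{j,i}) : j = 1, \dots, m, \ i = 1, \dots, q, \ \mu(I_{j,i}) > 0\} \]

is a $\delta$-fine tagged partition of $I$, and we have

\[ S(|f|, P) = \sum_{j=1}^{m} |f(x_j)|(a_j - a_{j-1}) = \sum_{j=1}^{m} \sum_{i=1}^{q} |f(x_j)|\,\mu(I_{j,i}) = S(|f|, Q). \]

Moreover,

\[ \mathcal{J} - \frac{\varepsilon}{2} \le \sum_{i=1}^{q} \left| \int_{c_{i-1}}^{c_i} f \right| = \sum_{i=1}^{q} \left| \sum_{j=1}^{m} \int_{I_{j,i}} f \right| \le \sum_{i=1}^{q} \sum_{j=1}^{m} \left| \int_{I_{j,i}} f \right| \le \mathcal{J}, \]

and by Corollary 9.2,

\[ \sum_{i=1}^{q} \sum_{j=1}^{m} \left| \, |f(x_j)|\,\mu(I_{j,i}) - \left| \int_{I_{j,i}} f \right| \, \right| \le \sum_{i=1}^{q} \sum_{j=1}^{m} \left| f(x_j)\,\mu(I_{j,i}) - \int_{I_{j,i}} f \right| \le 4\,\frac{\varepsilon}{8} = \frac{\varepsilon}{2}. \]

Consequently, we have

\[
\begin{aligned}
|S(|f|, P) - \mathcal{J}| &= |S(|f|, Q) - \mathcal{J}| \\
&\le \left| \sum_{i=1}^{q} \sum_{j=1}^{m} \left( |f(x_j)|\,\mu(I_{j,i}) - \left| \int_{I_{j,i}} f \right| \right) \right| + \left| \sum_{i=1}^{q} \sum_{j=1}^{m} \left| \int_{I_{j,i}} f \right| - \mathcal{J} \right| \\
&\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\end{aligned}
\]

which is what we wanted to prove. ∎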


We have a series of corollaries. 


Corollary 9.6 Let $f, g : I \to \mathbb{R}$ be two integrable functions such that, for every $x \in I$,

\[ |f(x)| \le g(x); \]

then $f$ is L-integrable on $I$.


Proof Take $c_0, c_1, \dots, c_q$ in $I$ so that $a = c_0 < c_1 < \cdots < c_q = b$. Since $-g(x) \le f(x) \le g(x)$ for every $x \in I$, we have that

\[ -\int_{c_{i-1}}^{c_i} g \le \int_{c_{i-1}}^{c_i} f \le \int_{c_{i-1}}^{c_i} g, \]

that is,

\[ \left| \int_{c_{i-1}}^{c_i} f \right| \le \int_{c_{i-1}}^{c_i} g, \]

for every $1 \le i \le q$. Hence,

\[ \sum_{i=1}^{q} \left| \int_{c_{i-1}}^{c_i} f \right| \le \sum_{i=1}^{q} \int_{c_{i-1}}^{c_i} g = \int_I g. \]

Then the set $S$ is bounded above by $\int_I g$, so that $f$ is L-integrable on $I$. ∎


Corollary 9.7 Let $f, g : I \to \mathbb{R}$ be two L-integrable functions, and let $\alpha \in \mathbb{R}$ be a constant. Then $f + g$ and $\alpha f$ are L-integrable on $I$.


Proof By assumption, $f$, $|f|$ and $g$, $|g|$ are integrable on $I$. Then $f + g$, $|f| + |g|$, $\alpha f$, and $|\alpha||f|$ are, too. On the other hand, for every $x \in I$,

\[ |(f + g)(x)| \le |f(x)| + |g(x)|, \qquad |\alpha f(x)| \le |\alpha||f(x)|. \]

Corollary 9.6 then guarantees that $f + g$ and $\alpha f$ are L-integrable on $I$. ∎


We have thus proved that the L-integrable functions make up a vector subspace 
of the space of integrable functions. 


Corollary 9.8 Let $f_1, f_2 : I \to \mathbb{R}$ be two L-integrable functions. Then $\min\{f_1, f_2\}$ and $\max\{f_1, f_2\}$ are L-integrable on $I$.


Proof It follows immediately from the formulas

\[ \min\{f_1, f_2\} = \frac{1}{2}\big(f_1 + f_2 - |f_1 - f_2|\big), \qquad \max\{f_1, f_2\} = \frac{1}{2}\big(f_1 + f_2 + |f_1 - f_2|\big), \]

and from Corollary 9.7. ∎


Corollary 9.9 A function $f : I \to \mathbb{R}$ is L-integrable if and only if both its positive part $f^+ = \max\{f, 0\}$ and its negative part $f^- = \max\{-f, 0\}$ are integrable on $I$. In that case, $\int_I f = \int_I f^+ - \int_I f^-$.


Proof It follows immediately from Corollary 9.8 and the formulas $f = f^+ - f^-$, $|f| = f^+ + f^-$. ∎


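We want to see now an example of an integrable function that is not L-integrable. Let $f : [0, 1] \to \mathbb{R}$ be defined by

\[ f(x) = \begin{cases} \dfrac{1}{x} \sin\left(\dfrac{1}{x^2}\right) & \text{if } x \ne 0, \\[1ex] 0 & \text{if } x = 0. \end{cases} \]

Let us define the two auxiliary functions $g : [0, 1] \to \mathbb{R}$ and $h : [0, 1] \to \mathbb{R}$ as

\[ g(x) = \begin{cases} \dfrac{1}{x} \sin\left(\dfrac{1}{x^2}\right) + x \cos\left(\dfrac{1}{x^2}\right) & \text{if } x \ne 0, \\[1ex] 0 & \text{if } x = 0, \end{cases} \]

\[ h(x) = \begin{cases} -x \cos\left(\dfrac{1}{x^2}\right) & \text{if } x \ne 0, \\[1ex] 0 & \text{if } x = 0. \end{cases} \]

It is easily seen that $g$ is primitivable on $[0, 1]$ and that one of its primitives $G : [0, 1] \to \mathbb{R}$ is given by

\[ G(x) = \begin{cases} \dfrac{x^2}{2} \cos\left(\dfrac{1}{x^2}\right) & \text{if } x \ne 0, \\[1ex] 0 & \text{if } x = 0. \end{cases} \]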


Moreover, $h$ is continuous on $[0, 1]$, so it is primitivable there, too. Hence, even the function $f = g + h$ is primitivable on $[0, 1]$. By the Fundamental Theorem, $f$ is then integrable on $[0, 1]$. We will show now that $|f|$ is not integrable on $[0, 1]$. Consider the intervals $[((k + 1)\pi)^{-1/2}, (k\pi)^{-1/2}]$, with $k \ge 1$. The function $|f|$ is continuous on these intervals, so it is primitivable there. By the substitution $y = 1/x^2$, we obtain

\[ \int_{((k+1)\pi)^{-1/2}}^{(k\pi)^{-1/2}} \frac{1}{x} \left| \sin\left(\frac{1}{x^2}\right) \right| dx = \int_{k\pi}^{(k+1)\pi} \frac{1}{2y} |\sin y| \, dy. \]

On the other hand,

\[ \int_{k\pi}^{(k+1)\pi} \frac{1}{2y} |\sin y| \, dy \ge \frac{1}{2(k+1)\pi} \int_{k\pi}^{(k+1)\pi} |\sin y| \, dy = \frac{1}{(k+1)\pi}. \]

If $|f|$ were integrable on $[0, 1]$, we would have that, for every $n \ge 1$,

\[ \int_0^1 |f| \ge \int_{((n+1)\pi)^{-1/2}}^{\pi^{-1/2}} |f| = \sum_{k=1}^{n} \int_{((k+1)\pi)^{-1/2}}^{(k\pi)^{-1/2}} |f| \ge \sum_{k=1}^{n} \frac{1}{(k+1)\pi}, \]

which is impossible since the series $\sum_{k=1}^{\infty} \frac{1}{(k+1)\pi}$ diverges. Hence, $f$ is not L-integrable on $[0, 1]$.
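The lower bounds used in this example can be checked numerically. The following sketch approximates $\int_{k\pi}^{(k+1)\pi} \frac{|\sin y|}{2y}\,dy$ with a composite midpoint rule (a choice made here for simplicity; the helper names `midpoint` and `piece` are hypothetical) and compares each value with $\frac{1}{(k+1)\pi}$.

```python
import math

def midpoint(f, lo, hi, steps=2000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def piece(k):
    """Approximate integral of |sin y| / (2y) over [k*pi, (k+1)*pi]."""
    return midpoint(lambda y: abs(math.sin(y)) / (2.0 * y),
                    k * math.pi, (k + 1) * math.pi)

pieces = [piece(k) for k in range(1, 51)]
bounds = [1.0 / ((k + 1) * math.pi) for k in range(1, 51)]
partial = sum(pieces)  # approximates the integral of |f| over [(51*pi)^(-1/2), pi^(-1/2)]

print(partial, sum(bounds))
```

The partial sums grow like a harmonic series, which is the numerical counterpart of the divergence argument above.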


9.3 Monotone Convergence Theorem


In this section and the next, we will consider the situation where a sequence of integrable functions $(f_n)_n$ converges pointwise to a function $f$, i.e., for every $x \in I$,

\[ \lim_n f_n(x) = f(x). \]

The question is whether $f$ is integrable on $I$, with

\[ \int_I f = \lim_n \int_I f_n, \]

i.e., whether the following formula holds:

\[ \int_I \lim_n f_n = \lim_n \int_I f_n. \]


This problem has already been faced in Theorem 7.24, involving continuous 
functions and uniform convergence. We will see now that the formula holds true 
if the sequence of functions is monotone. Let us state the following result, due to 
Beppo Levi. 


Theorem 9.10 (Monotone Convergence Theorem–I) We are given a function $f : I \to \mathbb{R}$ and a sequence of functions $f_n : I \to \mathbb{R}$, with $n \in \mathbb{N}$, verifying the following conditions:

(a) The sequence $(f_n)_n$ converges pointwise to $f$.

(b) The sequence $(f_n)_n$ is monotone.

(c) Each function $f_n$ is integrable on $I$.

(d) The real sequence $(\int_I f_n)_n$ has a finite limit.

Then $f$ is integrable on $I$, and

\[ \int_I f = \lim_n \int_I f_n. \]


Proof We assume for definiteness that the sequence $(f_n)_n$ is increasing, i.e.,

\[ f_n(x) \le f_{n+1}(x) \le f(x), \]

for every $n \in \mathbb{N}$ and every $x \in I$. Let us set

\[ \mathcal{J} = \lim_n \int_I f_n. \]

We will prove that $f$ is integrable on $I$ and that $\mathcal{J}$ is its integral. Fix $\varepsilon > 0$. Since every $f_n$ is integrable on $I$, there are some gauges $\delta_n^*$ on $I$ such that if $P_n$ is a $\delta_n^*$-fine tagged partition of $I$, then

\[ \left| S(f_n, P_n) - \int_I f_n \right| \le \frac{\varepsilon}{3 \cdot 2^{n+3}}. \]

Moreover, there is an $\bar{n} \in \mathbb{N}$ such that, for every $n \ge \bar{n}$, it is

\[ 0 \le \mathcal{J} - \int_I f_n \le \frac{\varepsilon}{3}, \]

and since the sequence $(f_n)_n$ converges pointwise on $I$ to $f$, for every $x \in I$ there is a natural number $n(x) \ge \bar{n}$ such that, for every $n \ge n(x)$, one has

\[ |f_n(x) - f(x)| \le \frac{\varepsilon}{3(b - a)}. \]

Let us define the gauge $\delta$ in the following way. For every $x \in I$,

\[ \delta(x) = \delta^*_{n(x)}(x). \]


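Now let $P = \{(x_1, [a_0, a_1]), \dots, (x_m, [a_{m-1}, a_m])\}$ be a $\delta$-fine tagged partition of $I$. We have

\[
\begin{aligned}
|S(f, P) - \mathcal{J}| &= \left| \sum_{j=1}^{m} f(x_j)(a_j - a_{j-1}) - \mathcal{J} \right| \\
&\le \left| \sum_{j=1}^{m} \big[ f(x_j) - f_{n(x_j)}(x_j) \big](a_j - a_{j-1}) \right| \\
&\quad + \left| \sum_{j=1}^{m} \left[ f_{n(x_j)}(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f_{n(x_j)} \right] \right| \\
&\quad + \left| \sum_{j=1}^{m} \int_{a_{j-1}}^{a_j} f_{n(x_j)} - \mathcal{J} \right|.
\end{aligned}
\]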

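Estimation of the first term gives

\[ \left| \sum_{j=1}^{m} \big[ f(x_j) - f_{n(x_j)}(x_j) \big](a_j - a_{j-1}) \right| \le \sum_{j=1}^{m} |f(x_j) - f_{n(x_j)}(x_j)|(a_j - a_{j-1}) \le \sum_{j=1}^{m} \frac{\varepsilon}{3(b - a)}(a_j - a_{j-1}) = \frac{\varepsilon}{3}. \]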

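To estimate the second term, set

\[ r = \min_{1 \le j \le m} n(x_j), \qquad s = \max_{1 \le j \le m} n(x_j), \]

and note that, putting together the terms whose indices $n(x_j)$ coincide with the same value $k$, by the second statement of Saks–Henstock Theorem 9.3, we obtain

\[
\begin{aligned}
\left| \sum_{j=1}^{m} \left[ f_{n(x_j)}(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f_{n(x_j)} \right] \right|
&= \left| \sum_{k=r}^{s} \ \sum_{\{1 \le j \le m \,:\, n(x_j) = k\}} \left[ f_k(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f_k \right] \right| \\
&\le \sum_{k=r}^{s} \left| \sum_{\{1 \le j \le m \,:\, n(x_j) = k\}} \left[ f_k(x_j)(a_j - a_{j-1}) - \int_{a_{j-1}}^{a_j} f_k \right] \right| \\
&\le \sum_{k=r}^{s} \frac{\varepsilon}{3 \cdot 2^{k+3}} \le \frac{\varepsilon}{3}.
\end{aligned}
\]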


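Concerning the third term, since $r \ge \bar{n}$, using the monotonicity of the sequence $(f_n)_n$ we have

\[
\begin{aligned}
0 \le \mathcal{J} - \int_I f_s &= \mathcal{J} - \sum_{j=1}^{m} \int_{a_{j-1}}^{a_j} f_s \\
&\le \mathcal{J} - \sum_{j=1}^{m} \int_{a_{j-1}}^{a_j} f_{n(x_j)} \\
&\le \mathcal{J} - \sum_{j=1}^{m} \int_{a_{j-1}}^{a_j} f_r = \mathcal{J} - \int_I f_r \le \frac{\varepsilon}{3},
\end{aligned}
\]

from which

\[ \left| \sum_{j=1}^{m} \int_{a_{j-1}}^{a_j} f_{n(x_j)} - \mathcal{J} \right| \le \frac{\varepsilon}{3}. \]

Hence,

\[ |S(f, P) - \mathcal{J}| \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]

and the proof is thus completed. ∎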


As an immediate consequence of the Monotone Convergence Theorem 9.10, we 
have an analogous statement for a series of functions. 


Corollary 9.11 We are given a function $f : I \to \mathbb{R}$ and a sequence of functions $f_k : I \to \mathbb{R}$, with $k \in \mathbb{N}$, verifying the following conditions:

(a) The series $\sum_k f_k$ converges pointwise to $f$.

(b) For every $k \in \mathbb{N}$ and every $x \in I$, we have $f_k(x) \ge 0$.

(c) Each function $f_k$ is integrable on $I$.

(d) The series $\sum_k (\int_I f_k)$ converges.

Then $f$ is integrable on $I$, and

\[ \int_I f = \sum_k \int_I f_k. \]

We can then write

\[ \int_I \sum_k f_k = \sum_k \int_I f_k. \]

Example Consider the Taylor series associated with the function $f(x) = e^{x^2}$,

\[ e^{x^2} = \sum_{k=0}^{\infty} \frac{x^{2k}}{k!}. \]

The functions $f_k(x) = \frac{x^{2k}}{k!}$ satisfy the first three assumptions of Corollary 9.11, with $I = [a, b]$ and

\[ \int_a^b \frac{x^{2k}}{k!} \, dx = \frac{b^{2k+1} - a^{2k+1}}{(2k + 1)k!}, \]

so it can be seen that the series $\sum_k (\int_I f_k)$ converges. It is then possible to apply the corollary, thereby obtaining

\[ \int_a^b e^{x^2} dx = \sum_{k=0}^{\infty} \frac{b^{2k+1} - a^{2k+1}}{(2k + 1)k!}. \]

In particular, considering the integral function $\int_a^{\cdot} f$, we find an expression for the primitives of $e^{x^2}$, i.e.,

\[ \int e^{x^2} dx = \sum_{k=0}^{\infty} \frac{x^{2k+1}}{(2k + 1)k!} + c. \]
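The series obtained in this example can be compared with a direct numerical quadrature. The sketch below sums the first terms of the series on $[a, b] = [0, 1]$ and compares the result with a composite Simpson rule; the interval, the number of terms, and the helper names `series_integral` and `simpson` are choices made here for illustration only.

```python
import math

def series_integral(a, b, terms=40):
    """Partial sum of sum_k (b^(2k+1) - a^(2k+1)) / ((2k+1) k!)."""
    return sum((b ** (2 * k + 1) - a ** (2 * k + 1)) / ((2 * k + 1) * math.factorial(k))
               for k in range(terms))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

series_value = series_integral(0.0, 1.0)
quadrature_value = simpson(lambda x: math.exp(x * x), 0.0, 1.0)

print(series_value, quadrature_value)
```

The two values agree to many digits, as the corollary predicts; the factorial in the denominator makes the series converge very quickly on bounded intervals.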


9.4 Dominated Convergence Theorem 
We start by proving the following preliminary result. 


Lemma 9.12 Let $f_1, f_2, \dots, f_n : I \to \mathbb{R}$ be integrable functions. If there exists an integrable function $g : I \to \mathbb{R}$ such that

\[ g(x) \le f_k(x), \quad \text{for every } x \in I \text{ and } k \in \{1, \dots, n\}, \]

then $\min\{f_1, f_2, \dots, f_n\}$ and $\max\{f_1, f_2, \dots, f_n\}$ are integrable on $I$.


Proof Consider the case $n = 2$. The functions $f_1 - g$ and $f_2 - g$, being integrable and nonnegative, are L-integrable. Hence, $\min\{f_1 - g, f_2 - g\}$ and $\max\{f_1 - g, f_2 - g\}$ are L-integrable, by Corollary 9.8. The conclusion then follows from the fact that

\[ \min\{f_1, f_2\} = \min\{f_1 - g, f_2 - g\} + g, \]
\[ \max\{f_1, f_2\} = \max\{f_1 - g, f_2 - g\} + g. \]

The general case can be easily obtained by induction. ∎


We are now ready to state and prove the following important extension of 
Theorem 7.24 due to Henri Lebesgue. 


Theorem 9.13 (Dominated Convergence Theorem–I) We are given a function $f : I \to \mathbb{R}$ and a sequence of functions $f_n : I \to \mathbb{R}$, with $n \in \mathbb{N}$, verifying the following conditions:

(a) The sequence $(f_n)_n$ converges pointwise to $f$.

(b) Each function $f_n$ is integrable on $I$.

(c) There are two integrable functions $g, h : I \to \mathbb{R}$ for which

\[ g(x) \le f_n(x) \le h(x) \]

for every $n \in \mathbb{N}$ and $x \in I$.

Then the sequence $(\int_I f_n)_n$ has a finite limit, $f$ is integrable on $I$, and

\[ \int_I f = \lim_n \int_I f_n. \]


Proof For any couple of natural numbers $n, \ell$, define the functions

\[ \phi_{n,\ell} = \min\{f_n, f_{n+1}, \dots, f_{n+\ell}\}, \qquad \Phi_{n,\ell} = \max\{f_n, f_{n+1}, \dots, f_{n+\ell}\}. \]

By Lemma 9.12, all $\phi_{n,\ell}$ and $\Phi_{n,\ell}$ are integrable on $I$. Moreover, for any fixed $n$, the sequence $(\phi_{n,\ell})_\ell$ is decreasing and bounded from below by $g$, and the sequence $(\Phi_{n,\ell})_\ell$ is increasing and bounded from above by $h$. Hence, these sequences converge to the two functions $\phi_n$ and $\Phi_n$, respectively:

\[ \lim_\ell \phi_{n,\ell} = \phi_n = \inf\{f_n, f_{n+1}, \dots\}, \qquad \lim_\ell \Phi_{n,\ell} = \Phi_n = \sup\{f_n, f_{n+1}, \dots\}. \]

Furthermore, the sequence $(\int_I \phi_{n,\ell})_\ell$ is decreasing and bounded from below by $\int_I g$, whereas the sequence $(\int_I \Phi_{n,\ell})_\ell$ is increasing and bounded from above by $\int_I h$. The Monotone Convergence Theorem 9.10 then guarantees that the functions $\phi_n$ and $\Phi_n$ are integrable on $I$.

Now the sequence $(\phi_n)_n$ is increasing, and the sequence $(\Phi_n)_n$ is decreasing; as $\lim_n f_n = f$, we must have

\[ \lim_n \phi_n = \liminf_n f_n = f, \qquad \lim_n \Phi_n = \limsup_n f_n = f. \]

Moreover, the sequence $(\int_I \phi_n)_n$ is increasing and bounded from above by $\int_I h$, whereas the sequence $(\int_I \Phi_n)_n$ is decreasing and bounded from below by $\int_I g$. We can then apply again the Monotone Convergence Theorem 9.10, from which we deduce that $f$ is integrable on $I$ and

\[ \int_I f = \lim_n \int_I \phi_n = \lim_n \int_I \Phi_n. \]

Since $\phi_n \le f_n \le \Phi_n$, we have $\int_I \phi_n \le \int_I f_n \le \int_I \Phi_n$, and the conclusion follows by the Squeeze Theorem 3.10. ∎


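Example Consider, for $n \ge 1$, the functions $f_n : [0, 3] \to \mathbb{R}$ defined by $f_n(x) = \arctan\left(nx - \frac{n^2}{n+1}\right)$. We have the following situation:

\[ \lim_n f_n(x) = \begin{cases} -\dfrac{\pi}{2} & \text{if } x \in [0, 1[\,, \\[1ex] \dfrac{\pi}{4} & \text{if } x = 1, \\[1ex] \dfrac{\pi}{2} & \text{if } x \in \,]1, 3]. \end{cases} \]

Moreover,

\[ |f_n(x)| < \frac{\pi}{2} \quad \text{for every } n \in \mathbb{N} \text{ and } x \in [0, 3]. \]

The assumptions of the Dominated Convergence Theorem 9.13 are then satisfied, taking the two constant functions $g(x) = -\frac{\pi}{2}$, $h(x) = \frac{\pi}{2}$. We can then conclude that

\[ \lim_n \int_0^3 \arctan\left(nx - \frac{n^2}{n+1}\right) dx = -\frac{\pi}{2} + 2\,\frac{\pi}{2} = \frac{\pi}{2}. \]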
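The limit in this example can be observed numerically. Since $u \arctan u - \frac{1}{2}\ln(1 + u^2)$ is a primitive of $\arctan u$, the integral over $[0, 3]$ has a closed form for each $n$. The sketch below evaluates it for increasing $n$; the function names are hypothetical.

```python
import math

def antiderivative(u):
    """A primitive of arctan: u*arctan(u) - log(1 + u^2)/2."""
    return u * math.atan(u) - 0.5 * math.log1p(u * u)

def integral_fn(n):
    """Integral over [0, 3] of arctan(n*x - n^2/(n+1)), via the substitution u = n*x - n^2/(n+1)."""
    shift = n * n / (n + 1)
    return (antiderivative(3 * n - shift) - antiderivative(-shift)) / n

for n in (1, 10, 100, 1000, 10000):
    print(n, integral_fn(n), abs(integral_fn(n) - math.pi / 2))
```

The distance from $\pi/2$ shrinks roughly like a constant over $n$, which is consistent with the conclusion of the Dominated Convergence Theorem but gives no rate in general.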


9.5 Hake’s Theorem 


Recall that a function $f : [a, b[\, \to \mathbb{R}$ is said to be integrable if it is integrable on $[a, c]$ for every $c \in \,]a, b[$, and the limit

\[ \lim_{c \to b^-} \int_a^c f \]

exists and is finite. We want to prove the following result by Heinrich Hake.


Theorem 9.14 (Hake’s Theorem) Let $b < +\infty$, and assume that $f : [a, b[\, \to \mathbb{R}$ is a function that is integrable on $[a, c]$, for every $c \in \,]a, b[$. Then the function $f$ is integrable on $[a, b[$ if and only if it is the restriction of an integrable function $\bar{f} : [a, b] \to \mathbb{R}$. In that case,

\[ \int_a^b f = \int_a^b \bar{f}. \]


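Proof Assume first that $f$ is the restriction to $[a, b[$ of an integrable function $\bar{f} : [a, b] \to \mathbb{R}$. Fix $\varepsilon > 0$; we want to find a $\gamma > 0$ such that, if $c \in \,]a, b[$ and $b - c \le \gamma$, then

\[ \left| \int_a^c f - \int_a^b \bar{f} \right| \le \varepsilon. \]

Let $\delta$ be a gauge such that, for every $\delta$-fine tagged partition $P$ of $[a, b]$, we have $|S(\bar{f}, P) - \int_a^b \bar{f}| \le \frac{\varepsilon}{2}$. We choose a positive constant $\gamma \le \delta(b)$ such that $\gamma |\bar{f}(b)| \le \frac{\varepsilon}{2}$. If $c \in \,]a, b[$ and $b - c \le \gamma$, then, by the Saks–Henstock Theorem 9.1, taking the $\delta$-fine tagged subpartition $\tilde{P} = \{(b, [c, b])\}$, we have

\[ \left| \bar{f}(b)(b - c) - \int_c^b \bar{f} \right| \le \frac{\varepsilon}{2}, \]

and hence

\[
\begin{aligned}
\left| \int_a^c f - \int_a^b \bar{f} \right| = \left| \int_c^b \bar{f} \right|
&\le \left| \int_c^b \bar{f} - \bar{f}(b)(b - c) \right| + |\bar{f}(b)(b - c)| \\
&\le \frac{\varepsilon}{2} + \gamma |\bar{f}(b)| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}
\]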

Let us prove now the other implication. Assume that $f$ is integrable on $[a, b[$, and let $\mathcal{J}$ be its integral, i.e.,

\[ \mathcal{J} = \lim_{c \to b^-} \int_a^c f. \]

We extend $f$ to a function $\bar{f}$ defined on the whole interval $[a, b]$ by setting, for instance, $\bar{f}(b) = 0$. To prove that $\bar{f}$ is integrable on $[a, b]$ with integral $\mathcal{J}$, fix $\varepsilon > 0$. By the preceding limit, there is a $\gamma > 0$ such that, if $c \in \,]a, b[$ and $b - c \le \gamma$, then

\[ \left| \int_a^c f - \mathcal{J} \right| \le \frac{\varepsilon}{2}. \]

Consider the sequence $(c_i)_i$ of points in $[a, b[$ given by

\[ c_i = b - \frac{b - a}{i + 1}. \]

Note that it is strictly increasing, it converges to $b$, and it is $c_0 = a$. Since $f$ is integrable on each interval $[c_{i-1}, c_i]$, we can consider, for each $i \ge 1$, a gauge $\delta_i$ on $[c_{i-1}, c_i]$ such that, for every $\delta_i$-fine tagged partition $P_i$ of $[c_{i-1}, c_i]$, we have

\[ \left| S(f, P_i) - \int_{c_{i-1}}^{c_i} f \right| \le \frac{\varepsilon}{2^{i+3}}. \]


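We define a gauge $\delta$ on $[a, b]$ by setting

\[
\delta(x) = \begin{cases}
\min\left\{ \delta_1(a), \dfrac{c_1 - a}{2} \right\} & \text{if } x = a, \\[2ex]
\min\left\{ \delta_i(x), \dfrac{x - c_{i-1}}{2}, \dfrac{c_i - x}{2} \right\} & \text{if } x \in \,]c_{i-1}, c_i[\,, \\[2ex]
\min\left\{ \delta_i(c_i), \delta_{i+1}(c_i), \dfrac{c_i - c_{i-1}}{2}, \dfrac{c_{i+1} - c_i}{2} \right\} & \text{if } x = c_i \text{ and } i \ge 1, \\[2ex]
\gamma & \text{if } x = b.
\end{cases}
\]
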
Let $P = \{(x_j, [a_{j-1}, a_j]) : j = 1, \dots, m\}$ be a $\delta$-fine tagged partition of $[a, b]$. Denote by $q$ the smallest integer for which $c_{q+1} \ge a_{m-1}$. The choice of the gauge allows us to split the Riemann sum, much like in the proof of Theorem 7.18 on the additivity of the integral on subintervals, so that the sum $S(\bar{f}, P)$ will contain

• the Riemann sums on $[c_{i-1}, c_i]$, with $i = 1, \dots, q$;

• a Riemann sum on $[c_q, a_{m-1}]$;

• a last term $\bar{f}(x_m)(b - a_{m-1})$.

(The first line disappears if $q = 0$.) Let $P_i$ be the tagged partition of $[c_{i-1}, c_i]$ and $Q$ be the tagged partition of $[c_q, a_{m-1}]$ whose intervals are those of $P$. Then

\[ S(\bar{f}, P) = \sum_{i=1}^{q} S(f, P_i) + S(f, Q) + \bar{f}(x_m)(b - a_{m-1}). \]


To better clarify what was just said, assume, for example, that $q = 2$; then there must be a $j_1$ for which $x_{j_1} = c_1$ and a $j_2$ for which $x_{j_2} = c_2$. Then

\[
\begin{aligned}
S(\bar{f}, P) &= \big[ f(x_1)(a_1 - a_0) + \cdots + f(x_{j_1 - 1})(a_{j_1 - 1} - a_{j_1 - 2}) + f(c_1)(c_1 - a_{j_1 - 1}) \big] \\
&\quad + \big[ f(c_1)(a_{j_1} - c_1) + \cdots + f(x_{j_2 - 1})(a_{j_2 - 1} - a_{j_2 - 2}) + f(c_2)(c_2 - a_{j_2 - 1}) \big] \\
&\quad + \big[ f(c_2)(a_{j_2} - c_2) + \cdots + f(x_{m-1})(a_{m-1} - a_{m-2}) \big] \\
&\quad + \bar{f}(x_m)(b - a_{m-1}).
\end{aligned}
\]

Note that $P_i$ is a $\delta_i$-fine tagged partition of $[c_{i-1}, c_i]$ and that $Q$ is a $\delta_{q+1}$-fine tagged subpartition of $[c_q, c_{q+1}]$. Moreover, by the choice of the gauge $\delta$, it must be that $x_m = b$ and, hence, $\bar{f}(x_m) = 0$ and $b - a_{m-1} \le \delta(b) = \gamma$. Using the fact that

\[ \int_a^{a_{m-1}} f = \sum_{i=1}^{q} \int_{c_{i-1}}^{c_i} f + \int_{c_q}^{a_{m-1}} f, \]


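by the Saks–Henstock Theorem 9.3 we have

\[
\begin{aligned}
|S(\bar{f}, P) - \mathcal{J}| &\le \left| S(\bar{f}, P) - \int_a^{a_{m-1}} f \right| + \left| \int_a^{a_{m-1}} f - \mathcal{J} \right| \\
&\le \sum_{i=1}^{q} \left| S(f, P_i) - \int_{c_{i-1}}^{c_i} f \right| + \left| S(f, Q) - \int_{c_q}^{a_{m-1}} f \right| + \frac{\varepsilon}{2} \\
&\le \sum_{i=1}^{q} \frac{\varepsilon}{2^{i+3}} + \frac{\varepsilon}{2^{q+4}} + \frac{\varepsilon}{2} \le \varepsilon,
\end{aligned}
\]

and the proof is thus completed. ∎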


The above theorem suggests that even for a function $f : [a, +\infty[\, \to \mathbb{R}$ the definition of the integral could be reduced to that of a usual integral. Indeed, fixing arbitrarily $b > a$, we could define a continuously differentiable strictly increasing auxiliary function $\varphi : [a, b[\, \to \mathbb{R}$ such that $\varphi(a) = a$ and $\lim_{u \to b^-} \varphi(u) = +\infty$; for example, take $\varphi(u) = a + \ln\left(\frac{b - a}{b - u}\right)$. A formal change of variables then gives

\[ \int_a^{+\infty} f(x) \, dx = \int_a^b f(\varphi(u)) \varphi'(u) \, du, \]

and Hake’s theorem applies to this last integral.
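The formal change of variables can be tried on a concrete case. The sketch below assumes $a = 0$, $b = 1$, the auxiliary function $\varphi(u) = a + \ln\frac{b-a}{b-u}$, and the test function $f(x) = x e^{-x}$, whose integral over $[0, +\infty[$ equals $1$; these choices and the helper names are made here for illustration. The transformed integrand is integrated over $[a, b[$ with a midpoint rule, which never evaluates the endpoint $b$.

```python
import math

def phi(u, a, b):
    """Auxiliary change of variables phi(u) = a + ln((b - a) / (b - u))."""
    return a + math.log((b - a) / (b - u))

def phi_prime(u, b):
    """Derivative of phi with respect to u."""
    return 1.0 / (b - u)

def transformed_integral(f, a, b, steps=200000):
    """Midpoint rule for the integral over [a, b[ of f(phi(u)) * phi'(u)."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        u = a + (i + 0.5) * h
        total += f(phi(u, a, b)) * phi_prime(u, b)
    return total * h

value = transformed_integral(lambda x: x * math.exp(-x), 0.0, 1.0)
print(value)
```

For this particular $f$ the transformed integrand reduces to $-\ln(1 - u)$, which is unbounded near $u = 1$ but integrable on $[0, 1[$, so the example also shows why an integral on a half-open interval is the natural setting.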

With this idea in mind, it is possible to prove that $f : [a, +\infty[\, \to \mathbb{R}$ is integrable and its integral is a real number $\mathcal{J}$ if and only if for every $\varepsilon > 0$ there is a gauge $\delta$, defined on $[a, +\infty[$, and a positive constant $\alpha$ such that, if

\[ a = a_0 < a_1 < \cdots < a_{m-1}, \quad \text{with } a_{m-1} \ge \alpha, \]

and, for every $j = 1, \dots, m - 1$, the points $x_j \in [a_{j-1}, a_j]$ satisfy

\[ x_j - a_{j-1} \le \delta(x_j) \quad \text{and} \quad a_j - x_j \le \delta(x_j), \]

then

\[ \left| \sum_{j=1}^{m-1} f(x_j)(a_j - a_{j-1}) - \mathcal{J} \right| \le \varepsilon. \]

We refer to Bartle’s book [1] for a complete treatment of this case.

Needless to say, similar considerations can be made in the case where the function $f$ is defined on an interval of the type $]a, b]$, with $a \ge -\infty$.


Part IV


Differential and Integral Calculus in $\mathbb{R}^N$




Let $\mathcal{O} \subseteq \mathbb{R}^N$ be an open set, $\mathbf{x}_0$ a point of $\mathcal{O}$, and $f : \mathcal{O} \to \mathbb{R}^M$ a given function. We want to extend the notion of derivative of $f$ at $\mathbf{x}_0$ already known in the case $M = N = 1$. The definition, inspired by Theorem 6.2, follows.


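Definition 10.1 We say that $f$ is “differentiable” at $\mathbf{x}_0$ if there exists a linear function $\ell : \mathbb{R}^N \to \mathbb{R}^M$ for which we can write

\[ f(\mathbf{x}) = f(\mathbf{x}_0) + \ell(\mathbf{x} - \mathbf{x}_0) + r(\mathbf{x}), \]

where $r$ is a function satisfying

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0. \]

If $f$ is differentiable at $\mathbf{x}_0$, then the linear function $\ell$ is called the “differential” of $f$ at $\mathbf{x}_0$ and is denoted by

\[ df(\mathbf{x}_0). \]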


Following the tradition for linear functions, taking $\mathbf{h} \in \mathbb{R}^N$, we will often write $df(\mathbf{x}_0)\mathbf{h}$ instead of $df(\mathbf{x}_0)(\mathbf{h})$.

Assuming that $\mathcal{O}$ is an open set is not really necessary, but it guarantees the uniqueness of the differential and simplifies many issues. In what follows, however, we will sometimes encounter situations where the domain is not open. More care will be needed in these cases.

We will now concentrate for a while on the simpler case $M = 1$.


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_10


10.1 The Differential of a Scalar-Valued Function 


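Assume, for simplicity, that $M = 1$. We start by fixing a “direction,” i.e., a vector $\mathbf{v} \in \mathbb{R}^N$, with $\|\mathbf{v}\| = 1$, also called a “unit vector.” Whenever it exists, we call “directional derivative” of $f$ at $\mathbf{x}_0$ in the direction $\mathbf{v}$ the limit

\[ \lim_{t \to 0} \frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t}, \]

which will be denoted by

\[ \frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}_0). \]

If $\mathbf{v}$ coincides with an element $\mathbf{e}_j$ of the canonical basis $(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N)$ of $\mathbb{R}^N$, the directional derivative is called the $j$th “partial derivative” of $f$ at $\mathbf{x}_0$ and is denoted by

\[ \frac{\partial f}{\partial x_j}(\mathbf{x}_0). \]

Note that, writing $\mathbf{x}_0 = (x_1^0, x_2^0, \dots, x_N^0)$, we have

\[
\begin{aligned}
\frac{\partial f}{\partial x_j}(\mathbf{x}_0) &= \lim_{t \to 0} \frac{f(\mathbf{x}_0 + t\mathbf{e}_j) - f(\mathbf{x}_0)}{t} \\
&= \lim_{t \to 0} \frac{f(x_1^0, \dots, x_j^0 + t, \dots, x_N^0) - f(x_1^0, \dots, x_j^0, \dots, x_N^0)}{t},
\end{aligned}
\]

so that it is commonly called “partial derivative with respect to the $j$th variable.” The following theorem shows, among other things, that the differential is unique.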


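Theorem 10.2 If $f$ is differentiable at $\mathbf{x}_0$, then $f$ is continuous at $\mathbf{x}_0$. Moreover, all the directional derivatives of $f$ at $\mathbf{x}_0$ exist: For every direction $\mathbf{v} \in \mathbb{R}^N$ we have

\[ \frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}_0) = df(\mathbf{x}_0)\mathbf{v}. \]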


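Proof We know that the function $\ell = df(\mathbf{x}_0)$, being linear, is continuous, and $\ell(\mathbf{0}) = 0$. Then

\[
\begin{aligned}
\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) &= \lim_{\mathbf{x} \to \mathbf{x}_0} \big[ f(\mathbf{x}_0) + \ell(\mathbf{x} - \mathbf{x}_0) + r(\mathbf{x}) \big] \\
&= f(\mathbf{x}_0) + \ell(\mathbf{0}) + \lim_{\mathbf{x} \to \mathbf{x}_0} r(\mathbf{x}) \\
&= f(\mathbf{x}_0) + \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} \, \lim_{\mathbf{x} \to \mathbf{x}_0} \|\mathbf{x} - \mathbf{x}_0\| \\
&= f(\mathbf{x}_0),
\end{aligned}
\]

showing that $f$ is continuous at $\mathbf{x}_0$. Concerning the directional derivatives, we have

\[
\begin{aligned}
\lim_{t \to 0} \frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t} &= \lim_{t \to 0} \frac{df(\mathbf{x}_0)(t\mathbf{v}) + r(\mathbf{x}_0 + t\mathbf{v})}{t} \\
&= \lim_{t \to 0} \frac{t\, df(\mathbf{x}_0)\mathbf{v} + r(\mathbf{x}_0 + t\mathbf{v})}{t} \\
&= df(\mathbf{x}_0)\mathbf{v} + \lim_{t \to 0} \frac{r(\mathbf{x}_0 + t\mathbf{v})}{t}.
\end{aligned}
\]

On the other hand, since $\|\mathbf{v}\| = 1$, the change of variables formula (3.1) gives us

\[ \lim_{t \to 0} \left| \frac{r(\mathbf{x}_0 + t\mathbf{v})}{t} \right| = \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{|r(\mathbf{x})|}{\|\mathbf{x} - \mathbf{x}_0\|} = 0, \]

whence the conclusion. ∎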


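In particular, if $\mathbf{v}$ coincides with an element $\mathbf{e}_j$ of the canonical basis $(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N)$, then

\[ \frac{\partial f}{\partial x_j}(\mathbf{x}_0) = df(\mathbf{x}_0)\mathbf{e}_j. \]

Writing the vector $\mathbf{h} \in \mathbb{R}^N$ as $\mathbf{h} = h_1\mathbf{e}_1 + h_2\mathbf{e}_2 + \cdots + h_N\mathbf{e}_N$, by linearity we have

\[
\begin{aligned}
df(\mathbf{x}_0)\mathbf{h} &= h_1\, df(\mathbf{x}_0)\mathbf{e}_1 + h_2\, df(\mathbf{x}_0)\mathbf{e}_2 + \cdots + h_N\, df(\mathbf{x}_0)\mathbf{e}_N \\
&= h_1 \frac{\partial f}{\partial x_1}(\mathbf{x}_0) + h_2 \frac{\partial f}{\partial x_2}(\mathbf{x}_0) + \cdots + h_N \frac{\partial f}{\partial x_N}(\mathbf{x}_0),
\end{aligned}
\]

that is,

\[ df(\mathbf{x}_0)\mathbf{h} = \sum_{j=1}^{N} \frac{\partial f}{\partial x_j}(\mathbf{x}_0) h_j. \]

If we define the “gradient” of $f$ at $\mathbf{x}_0$ as the vector

\[ \nabla f(\mathbf{x}_0) = \left( \frac{\partial f}{\partial x_1}(\mathbf{x}_0), \dots, \frac{\partial f}{\partial x_N}(\mathbf{x}_0) \right), \]

we can then write

\[ df(\mathbf{x}_0)\mathbf{h} = \nabla f(\mathbf{x}_0) \cdot \mathbf{h}. \]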
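The identity between the directional derivative and the scalar product of the gradient with the direction can be checked numerically. The following sketch uses a sample function $f(x, y) = \sin x \, e^y + x y^2$ with a hand-computed gradient; the function, the point, and the direction are assumptions made here for illustration.

```python
import math

def f(x, y):
    return math.sin(x) * math.exp(y) + x * y * y

def grad_f(x, y):
    """Gradient of f, computed by hand."""
    return (math.cos(x) * math.exp(y) + y * y,
            math.sin(x) * math.exp(y) + 2.0 * x * y)

def directional_quotient(x0, y0, v, t):
    """Difference quotient (f(x0 + t v) - f(x0)) / t."""
    return (f(x0 + t * v[0], y0 + t * v[1]) - f(x0, y0)) / t

x0, y0 = 0.7, -0.3
v = (math.cos(1.0), math.sin(1.0))          # a unit vector
g = grad_f(x0, y0)
exact = g[0] * v[0] + g[1] * v[1]           # gradient dotted with v
approx = directional_quotient(x0, y0, v, 1e-6)

print(exact, approx)
```

The difference quotient only approximates the limit, so agreement is expected up to an error of the order of the step $t$.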


Remark 10.3 The mere existence of the directional derivatives at some point $\mathbf{x}_0$ does not guarantee differentiability there. For example, the function $f : \mathbb{R}^2 \to \mathbb{R}$, defined as

\[ f(x, y) = \begin{cases} \dfrac{x^4 y^2}{(x^4 + y^2)^2} & \text{if } (x, y) \ne (0, 0), \\[1ex] 0 & \text{if } (x, y) = (0, 0), \end{cases} \]

has all its directional derivatives at $\mathbf{x}_0 = (0, 0)$ equal to $0$. However, it is not even continuous there since its restriction to the parabola $\{(x, y) : y = x^2\}$ is constantly equal to $\frac{1}{4}$.
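The phenomenon described in this remark can be observed numerically. The sketch below takes $f(x, y) = \frac{x^4 y^2}{(x^4 + y^2)^2}$ away from the origin and $f(0, 0) = 0$, a function with exactly the two properties stated in the remark (this specific formula is an assumption of the sketch). It evaluates difference quotients along several directions through the origin and the values of $f$ along the parabola $y = x^2$.

```python
import math

def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 4 * y ** 2 / (x ** 4 + y ** 2) ** 2

t = 1e-4
quotients = []
for k in range(12):
    angle = 2.0 * math.pi * k / 12
    v = (math.cos(angle), math.sin(angle))
    # difference quotient (f(t v) - f(0, 0)) / t along the direction v
    quotients.append(f(t * v[0], t * v[1]) / t)

parabola_values = [f(x, x * x) for x in (0.1, 0.01, 0.001)]

print(max(abs(q) for q in quotients))
print(parabola_values)
```

The difference quotients are small in every sampled direction, while the values along the parabola stay at $\frac{1}{4}$ however close to the origin one gets.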


Here is a result showing that the existence of the partial derivatives is sufficient 
for the differentiability, provided that they are continuous. 


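Theorem 10.4 If $f$ has partial derivatives defined in a neighborhood of $\mathbf{x}_0$, and they are continuous at $\mathbf{x}_0$, then $f$ is differentiable at $\mathbf{x}_0$.


Proof To simplify the notations, we will assume that $N = 2$. We define the function $\ell : \mathbb{R}^2 \to \mathbb{R}$ associating to every vector $\mathbf{h} = (h_1, h_2)$ the real number

\[ \ell(\mathbf{h}) = \frac{\partial f}{\partial x_1}(\mathbf{x}_0) h_1 + \frac{\partial f}{\partial x_2}(\mathbf{x}_0) h_2. \]

We will prove that $\ell$ is indeed the differential of $f$ at $\mathbf{x}_0$. First of all, it is readily verified that it is linear. Moreover, writing $\mathbf{x}_0 = (x_1^0, x_2^0)$ and $\mathbf{x} = (x_1, x_2)$, by the Lagrange Mean Value Theorem 6.11 we have

\[
\begin{aligned}
f(\mathbf{x}) - f(\mathbf{x}_0) &= \big( f(x_1, x_2) - f(x_1^0, x_2) \big) + \big( f(x_1^0, x_2) - f(x_1^0, x_2^0) \big) \\
&= \frac{\partial f}{\partial x_1}(\xi_1, x_2)(x_1 - x_1^0) + \frac{\partial f}{\partial x_2}(x_1^0, \xi_2)(x_2 - x_2^0)
\end{aligned}
\]

for some $\xi_1 \in \,]x_1^0, x_1[$ and $\xi_2 \in \,]x_2^0, x_2[$. Hence,

\[
\begin{aligned}
r(\mathbf{x}) &= f(\mathbf{x}) - f(\mathbf{x}_0) - \ell(\mathbf{x} - \mathbf{x}_0) \\
&= \left[ \frac{\partial f}{\partial x_1}(\xi_1, x_2) - \frac{\partial f}{\partial x_1}(x_1^0, x_2^0) \right](x_1 - x_1^0) + \left[ \frac{\partial f}{\partial x_2}(x_1^0, \xi_2) - \frac{\partial f}{\partial x_2}(x_1^0, x_2^0) \right](x_2 - x_2^0).
\end{aligned}
\]

Then, since $|x_1 - x_1^0| \le \|\mathbf{x} - \mathbf{x}_0\|$ and $|x_2 - x_2^0| \le \|\mathbf{x} - \mathbf{x}_0\|$,

\[ \frac{|r(\mathbf{x})|}{\|\mathbf{x} - \mathbf{x}_0\|} \le \left| \frac{\partial f}{\partial x_1}(\xi_1, x_2) - \frac{\partial f}{\partial x_1}(x_1^0, x_2^0) \right| + \left| \frac{\partial f}{\partial x_2}(x_1^0, \xi_2) - \frac{\partial f}{\partial x_2}(x_1^0, x_2^0) \right|. \]

Letting $\mathbf{x}$ tend to $\mathbf{x}_0$, we have that $(\xi_1, x_2) \to (x_1^0, x_2^0)$ and $(x_1^0, \xi_2) \to (x_1^0, x_2^0)$ so that, since $\frac{\partial f}{\partial x_1}$ and $\frac{\partial f}{\partial x_2}$ are continuous at $\mathbf{x}_0 = (x_1^0, x_2^0)$, it must be that

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0, \]

whence the conclusion. ∎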


We say that $f : \mathcal{O} \to \mathbb{R}$ is “differentiable” if it is so at every point of $\mathcal{O}$; it is “of class $C^1$” or “a $C^1$-function” if it has partial derivatives that are continuous on the whole domain $\mathcal{O}$. From the previous theorem we have that a function of class $C^1$ is surely differentiable.

Assume now that $f$ is defined on some domain $\mathcal{D}$ that is not an open set. In this case, we say that $f$ is “differentiable” if it is the restriction of some differentiable function defined on an open set $\mathcal{O}$ containing $\mathcal{D}$, and similarly when $f$ is “of class $C^1$.”


10.2 Some Computational Rules 
Let us start with some simple propositions. 


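Proposition 10.5 If $f : \mathcal{O} \to \mathbb{R}$ is constant, then $df(\mathbf{x}_0) = 0$ for every $\mathbf{x}_0 \in \mathcal{O}$.


Proof Let $f(\mathbf{x}) = c$ for every $\mathbf{x} \in \mathcal{O}$. Then, setting $\ell(\mathbf{h}) = 0$ for every $\mathbf{h} \in \mathbb{R}^N$,

\[ f(\mathbf{x}) - f(\mathbf{x}_0) - \ell(\mathbf{x} - \mathbf{x}_0) = c - c - 0 = 0 \]

for every $\mathbf{x} \in \mathcal{O}$. ∎


Proposition 10.6 If $A : \mathbb{R}^N \to \mathbb{R}$ is linear, then $dA(\mathbf{x}_0) = A$ for every $\mathbf{x}_0 \in \mathcal{O}$.


Proof Let $f(\mathbf{x}) = A\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^N$. Then, setting $\ell(\mathbf{h}) = A\mathbf{h}$, by linearity,

\[ f(\mathbf{x}) - f(\mathbf{x}_0) - \ell(\mathbf{x} - \mathbf{x}_0) = A\mathbf{x} - A\mathbf{x}_0 - A(\mathbf{x} - \mathbf{x}_0) = 0 \]

for every $\mathbf{x} \in \mathcal{O}$. ∎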


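Proposition 10.7 If $B : \mathbb{R}^{N_1} \times \mathbb{R}^{N_2} \to \mathbb{R}$ is bilinear, writing $\mathbf{x}_0 = (x_0, y_0)$ with $x_0 \in \mathbb{R}^{N_1}$, $y_0 \in \mathbb{R}^{N_2}$, and $\mathbf{h} = (h, k)$, with $h \in \mathbb{R}^{N_1}$, $k \in \mathbb{R}^{N_2}$, we have

\[ dB(\mathbf{x}_0)(\mathbf{h}) = B(h, y_0) + B(x_0, k). \]


Proof Writing $\mathbf{x} = (x, y)$ with $x \in \mathbb{R}^{N_1}$, $y \in \mathbb{R}^{N_2}$, let $f(\mathbf{x}) = B(x, y)$. Then, setting $\ell(\mathbf{h}) = B(h, y_0) + B(x_0, k)$, we compute

\[
\begin{aligned}
r(\mathbf{x}) &= f(\mathbf{x}) - f(\mathbf{x}_0) - \ell(\mathbf{x} - \mathbf{x}_0) \\
&= B(x, y) - B(x_0, y_0) - B(x - x_0, y_0) - B(x_0, y - y_0) \\
&= B(x - x_0, y - y_0) = B(\mathbf{x} - \mathbf{x}_0).
\end{aligned}
\]

Denoting by $e_1, \dots, e_{N_1}$ the vectors of the canonical basis of $\mathbb{R}^{N_1}$ and by $\tilde{e}_1, \dots, \tilde{e}_{N_2}$ those of the canonical basis of $\mathbb{R}^{N_2}$, for every $x \in \mathbb{R}^{N_1}$ and $y \in \mathbb{R}^{N_2}$ we have that

\[ B(x, y) = B\left( \sum_{i=1}^{N_1} x_i e_i, \sum_{j=1}^{N_2} y_j \tilde{e}_j \right) = \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} x_i y_j B(e_i, \tilde{e}_j), \]

hence there is a constant $C$ such that

\[ |B(x, y)| \le C \|x\| \|y\| \quad \text{for every } x \in \mathbb{R}^{N_1}, \ y \in \mathbb{R}^{N_2}. \]

Then

\[ |B(\mathbf{x} - \mathbf{x}_0)| = |B(x - x_0, y - y_0)| \le C \|x - x_0\| \|y - y_0\| \le C \|\mathbf{x} - \mathbf{x}_0\|^2, \]

whence, if $\mathbf{x} \ne \mathbf{x}_0$, then

\[ \frac{|r(\mathbf{x})|}{\|\mathbf{x} - \mathbf{x}_0\|} = \frac{|B(\mathbf{x} - \mathbf{x}_0)|}{\|\mathbf{x} - \mathbf{x}_0\|} \le C \|\mathbf{x} - \mathbf{x}_0\|, \]

and finally

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0. \]

The statement is thus proved. ∎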
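The estimate on the remainder of a bilinear function can be illustrated on a small case. The sketch below uses a hypothetical bilinear form on $\mathbb{R}^2 \times \mathbb{R}^3$ given by a matrix `M`, and checks that the ratio between the remainder and the norm of the increment decreases linearly with the size of the increment. The matrix, the base point, the increments, and the constant `C` (the sum of the absolute values of the entries, one admissible choice) are assumptions made here.

```python
import math

# Hypothetical bilinear form B(x, y) = sum_i sum_j x_i * M[i][j] * y_j on R^2 x R^3.
M = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]

def B(x, y):
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(3))

def remainder_ratio(x0, y0, h, k):
    """|r| / ||(h, k)||, with r = B(x0+h, y0+k) - B(x0, y0) - [B(h, y0) + B(x0, k)]."""
    x = [x0[i] + h[i] for i in range(2)]
    y = [y0[j] + k[j] for j in range(3)]
    r = B(x, y) - B(x0, y0) - (B(h, y0) + B(x0, k))
    norm = math.sqrt(sum(t * t for t in h) + sum(t * t for t in k))
    return abs(r) / norm

x0, y0 = [1.0, 2.0], [0.5, -1.0, 3.0]
C = sum(abs(entry) for row in M for entry in row)   # one admissible constant

ratios = []
for scale in (1e-1, 1e-2, 1e-3):
    h = [scale, -scale]
    k = [scale, 2.0 * scale, -scale]
    ratios.append((scale, remainder_ratio(x0, y0, h, k)))

print(ratios)
```

Each ratio stays below $C$ times the norm of the increment, in agreement with the inequality used in the proof.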


We now compute the differential of the sum of two functions and the product 
with some constants. 


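Proposition 10.8 If $f, g : \mathcal{O} \to \mathbb{R}$ are differentiable at $\mathbf{x}_0$ and $\alpha, \beta$ are two real numbers, then

\[ d(\alpha f + \beta g)(\mathbf{x}_0) = \alpha\, df(\mathbf{x}_0) + \beta\, dg(\mathbf{x}_0). \]

Proof Writing

\[ f(\mathbf{x}) = f(\mathbf{x}_0) + df(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + r_1(\mathbf{x}), \]
\[ g(\mathbf{x}) = g(\mathbf{x}_0) + dg(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + r_2(\mathbf{x}), \]

we have that

\[ (\alpha f + \beta g)(\mathbf{x}) = (\alpha f + \beta g)(\mathbf{x}_0) + \big(\alpha\, df(\mathbf{x}_0) + \beta\, dg(\mathbf{x}_0)\big)(\mathbf{x} - \mathbf{x}_0) + r(\mathbf{x}), \]

with $r(\mathbf{x}) = \alpha r_1(\mathbf{x}) + \beta r_2(\mathbf{x})$, and

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = \alpha \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r_1(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} + \beta \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r_2(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0. \]

Hence, $\alpha f + \beta g$ is differentiable at $\mathbf{x}_0$ with differential $\alpha\, df(\mathbf{x}_0) + \beta\, dg(\mathbf{x}_0)$. ∎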


10.3 Twice Differentiable Functions 


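Let $\mathcal{O}$ be an open subset of $\mathbb{R}^N$, and $f : \mathcal{O} \to \mathbb{R}$ be a differentiable function. We want to extend the notion of “second derivative,” which is well known in the case $N = 1$. For simplicity’s sake, let us deal with the case $N = 2$. If the partial derivatives $\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2} : \mathcal{O} \to \mathbb{R}$ have themselves partial derivatives at a point $\mathbf{x}_0$, these are said to be “second-order partial derivatives” of $f$ at $\mathbf{x}_0$ and are denoted by

\[ \frac{\partial^2 f}{\partial x_1^2}(\mathbf{x}_0) = \frac{\partial}{\partial x_1} \frac{\partial f}{\partial x_1}(\mathbf{x}_0), \qquad \frac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{x}_0) = \frac{\partial}{\partial x_2} \frac{\partial f}{\partial x_1}(\mathbf{x}_0), \]

\[ \frac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{x}_0) = \frac{\partial}{\partial x_1} \frac{\partial f}{\partial x_2}(\mathbf{x}_0), \qquad \frac{\partial^2 f}{\partial x_2^2}(\mathbf{x}_0) = \frac{\partial}{\partial x_2} \frac{\partial f}{\partial x_2}(\mathbf{x}_0). \]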

Here is a relation involving the “mixed derivatives.” 


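Theorem 10.9 (Schwarz Theorem) If the second-order mixed partial derivatives $\frac{\partial^2 f}{\partial x_2 \partial x_1}$, $\frac{\partial^2 f}{\partial x_1 \partial x_2}$ exist in a neighborhood of $\mathbf{x}_0$ and they are continuous at $\mathbf{x}_0$, then

\[ \frac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{x}_0) = \frac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{x}_0). \]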


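Proof Let $\rho > 0$ be such that $B(\mathbf{x}_0, \rho) \subseteq \mathcal{O}$. We write $\mathbf{x}_0 = (x_1^0, x_2^0)$ and we take an $\mathbf{x} = (x_1, x_2) \in B(\mathbf{x}_0, \rho)$ such that $x_1 \ne x_1^0$ and $x_2 \ne x_2^0$. It is then possible to define

\[ g(x_1, x_2) = \frac{f(x_1, x_2) - f(x_1, x_2^0)}{x_2 - x_2^0}, \qquad h(x_1, x_2) = \frac{f(x_1, x_2) - f(x_1^0, x_2)}{x_1 - x_1^0}. \]

We can verify that

\[ \frac{g(x_1, x_2) - g(x_1^0, x_2)}{x_1 - x_1^0} = \frac{h(x_1, x_2) - h(x_1, x_2^0)}{x_2 - x_2^0}. \]

By the Lagrange Mean Value Theorem 6.11, there is a $\xi_1 \in \,]x_1^0, x_1[$ such that

\[ \frac{g(x_1, x_2) - g(x_1^0, x_2)}{x_1 - x_1^0} = \frac{\partial g}{\partial x_1}(\xi_1, x_2) = \frac{\frac{\partial f}{\partial x_1}(\xi_1, x_2) - \frac{\partial f}{\partial x_1}(\xi_1, x_2^0)}{x_2 - x_2^0}, \]

and there is a $\xi_2 \in \,]x_2^0, x_2[$ such that

\[ \frac{h(x_1, x_2) - h(x_1, x_2^0)}{x_2 - x_2^0} = \frac{\partial h}{\partial x_2}(x_1, \xi_2) = \frac{\frac{\partial f}{\partial x_2}(x_1, \xi_2) - \frac{\partial f}{\partial x_2}(x_1^0, \xi_2)}{x_1 - x_1^0}. \]

Again by the Lagrange Mean Value Theorem 6.11, there is an $\eta_2 \in \,]x_2^0, x_2[$ such that

\[ \frac{\frac{\partial f}{\partial x_1}(\xi_1, x_2) - \frac{\partial f}{\partial x_1}(\xi_1, x_2^0)}{x_2 - x_2^0} = \frac{\partial^2 f}{\partial x_2 \partial x_1}(\xi_1, \eta_2), \]

and there is an $\eta_1 \in \,]x_1^0, x_1[$ such that

\[ \frac{\frac{\partial f}{\partial x_2}(x_1, \xi_2) - \frac{\partial f}{\partial x_2}(x_1^0, \xi_2)}{x_1 - x_1^0} = \frac{\partial^2 f}{\partial x_1 \partial x_2}(\eta_1, \xi_2). \]

Hence,

\[ \frac{\partial^2 f}{\partial x_2 \partial x_1}(\xi_1, \eta_2) = \frac{\partial^2 f}{\partial x_1 \partial x_2}(\eta_1, \xi_2). \]

Taking the limit, as $\mathbf{x} = (x_1, x_2)$ tends to $\mathbf{x}_0 = (x_1^0, x_2^0)$, we have that both $(\xi_1, \eta_2)$ and $(\eta_1, \xi_2)$ converge to $\mathbf{x}_0$, and the continuity of the second-order partial derivatives leads to the conclusion. ∎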
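The equality of the mixed derivatives can be checked numerically on a smooth sample function. The sketch below takes $f(x_1, x_2) = x_1^3 x_2^2 + \sin(x_1 x_2)$, an example chosen here, uses its first-order partial derivatives computed by hand, and differentiates each of them numerically with respect to the other variable by a central difference.

```python
import math

def f_x(x, y):
    """Partial derivative of f(x, y) = x^3 y^2 + sin(x y) with respect to x."""
    return 3.0 * x * x * y * y + y * math.cos(x * y)

def f_y(x, y):
    """Partial derivative of f with respect to y."""
    return 2.0 * x ** 3 * y + x * math.cos(x * y)

def central(g, t, step=1e-5):
    """Central difference approximation of g'(t)."""
    return (g(t + step) - g(t - step)) / (2.0 * step)

x0, y0 = 0.8, -1.3
d2_yx = central(lambda y: f_x(x0, y), y0)   # derivative in x_2 of df/dx_1
d2_xy = central(lambda x: f_y(x, y0), x0)   # derivative in x_1 of df/dx_2

print(d2_yx, d2_xy)
```

The two approximations are computed from different first derivatives, so their agreement is a genuine check rather than a consequence of a symmetric difference formula.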


We say that $f : \mathcal{O} \to \mathbb{R}$ is “of class $C^2$” or “a $C^2$-function” if all its second-order partial derivatives exist and are continuous on $\mathcal{O}$.

It is useful to consider the “Hessian matrix” of $f$ at $\mathbf{x}_0$:

\[ Hf(\mathbf{x}_0) = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{x}_0) \\[2ex] \dfrac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_2^2}(\mathbf{x}_0) \end{pmatrix}; \]

if $f$ is of class $C^2$, then this is a symmetric matrix.

What was just said extends without difficulty for any $N \ge 2$; if $f$ is of class $C^2$, then the Hessian matrix is an $N \times N$ symmetric matrix.

One can further define by induction the $n$th-order partial derivatives. It is said that $f : \mathcal{O} \to \mathbb{R}$ is “of class $C^n$” or “a $C^n$-function” if all its $n$th-order partial derivatives exist and are continuous on $\mathcal{O}$.


10.4 Taylor Formula 


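Let $\mathcal{O}$ be an open subset of $\mathbb{R}^N$, and assume that $f : \mathcal{O} \to \mathbb{R}$ is a function of class $C^{n+1}$ for some $n \ge 1$.

As previously, for simplicity we will deal with the case $N = 2$. Let us introduce the following notations:

\[ D_{x_1} = \frac{\partial}{\partial x_1}, \qquad D_{x_2} = \frac{\partial}{\partial x_2}, \]

\[ D_{x_1}^2 = \frac{\partial^2}{\partial x_1^2}, \qquad D_{x_1} D_{x_2} = \frac{\partial^2}{\partial x_1 \partial x_2}, \qquad D_{x_2}^2 = \frac{\partial^2}{\partial x_2^2}, \]

and so on for the higher-order derivatives. Note that for any vector $\mathbf{h} = (h_1, h_2) \in \mathbb{R}^2$,

\[ df(\mathbf{x}_0)\mathbf{h} = h_1 D_{x_1} f(\mathbf{x}_0) + h_2 D_{x_2} f(\mathbf{x}_0), \]

which can also be written

\[ df(\mathbf{x}_0)\mathbf{h} = [h_1 D_{x_1} + h_2 D_{x_2}] f(\mathbf{x}_0). \]

In this way, we can think that the function $f$ is transformed by the “operator” $[h_1 D_{x_1} + h_2 D_{x_2}]$ into the new function

\[ [h_1 D_{x_1} + h_2 D_{x_2}] f = h_1 D_{x_1} f + h_2 D_{x_2} f. \]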


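Given two points $\mathbf{x}_0 \ne \mathbf{x}$ in $\mathbb{R}^N$, the “segment” joining them is defined by

\[ [\mathbf{x}_0, \mathbf{x}] = \{\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0) : t \in [0, 1]\}; \]

similarly, we will write

\[ ]\mathbf{x}_0, \mathbf{x}[ \, = \{\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0) : t \in \,]0, 1[\,\}. \]

Assume now that the segment $[\mathbf{x}_0, \mathbf{x}]$ is contained in $\mathcal{O}$, and consider the function $\phi : [0, 1] \to \mathbb{R}$ defined as

\[ \phi(t) = f(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)). \]

We will prove that $\phi$ is $n + 1$ times differentiable on $[0, 1]$. For any $t \in [0, 1]$, since $f$ is differentiable at $\mathbf{u}_0 = \mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)$, we have that

\[ f(\mathbf{u}) = f(\mathbf{u}_0) + df(\mathbf{u}_0)(\mathbf{u} - \mathbf{u}_0) + r(\mathbf{u}), \]

with

\[ \lim_{\mathbf{u} \to \mathbf{u}_0} \frac{r(\mathbf{u})}{\|\mathbf{u} - \mathbf{u}_0\|} = 0. \]

Hence,

\[
\begin{aligned}
\lim_{s \to t} \frac{\phi(s) - \phi(t)}{s - t} &= \lim_{s \to t} \frac{f(\mathbf{x}_0 + s(\mathbf{x} - \mathbf{x}_0)) - f(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0))}{s - t} \\
&= \lim_{s \to t} \frac{df(\mathbf{u}_0)\big((s - t)(\mathbf{x} - \mathbf{x}_0)\big) + r(\mathbf{x}_0 + s(\mathbf{x} - \mathbf{x}_0))}{s - t} \\
&= df(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0))(\mathbf{x} - \mathbf{x}_0) + \lim_{s \to t} \frac{r(\mathbf{x}_0 + s(\mathbf{x} - \mathbf{x}_0))}{s - t},
\end{aligned}
\]

and since

\[ \lim_{s \to t} \left| \frac{r(\mathbf{x}_0 + s(\mathbf{x} - \mathbf{x}_0))}{s - t} \right| = \lim_{\mathbf{u} \to \mathbf{u}_0} \frac{|r(\mathbf{u})|}{\|\mathbf{u} - \mathbf{u}_0\|} \|\mathbf{x} - \mathbf{x}_0\| = 0, \]

we have that

\[ \phi'(t) = \lim_{s \to t} \frac{\phi(s) - \phi(t)}{s - t} = df(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0))(\mathbf{x} - \mathbf{x}_0). \]

With the new notations, setting $\mathbf{x} - \mathbf{x}_0 = \mathbf{h} = (h_1, h_2)$, we can write

\[ \phi'(t) = [h_1 D_{x_1} + h_2 D_{x_2}] f(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)) = g(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)), \]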


where $g$ is the function $[h_1 D_{x_1} + h_2 D_{x_2}] f$. We can then iterate the procedure and compute the second derivative

\[
\begin{aligned}
\phi''(t) &= [h_1 D_{x_1} + h_2 D_{x_2}] g(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)) \\
&= [h_1 D_{x_1} + h_2 D_{x_2}][h_1 D_{x_1} + h_2 D_{x_2}] f(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)).
\end{aligned}
\]

For briefness, we will write

\[ \phi''(t) = [h_1 D_{x_1} + h_2 D_{x_2}]^2 f(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)). \]

Notice that, by the linearity of the partial derivatives and the equality of the second-order mixed derivatives (Schwarz Theorem 10.9),

\[
\begin{aligned}
[h_1 D_{x_1} + h_2 D_{x_2}]^2 f &= h_1^2 D_{x_1}^2 f + 2 h_1 h_2 D_{x_1} D_{x_2} f + h_2^2 D_{x_2}^2 f \\
&= [h_1^2 D_{x_1}^2 + 2 h_1 h_2 D_{x_1} D_{x_2} + h_2^2 D_{x_2}^2] f.
\end{aligned}
\]

We now observe that the equality

\[ [h_1 D_{x_1} + h_2 D_{x_2}]^2 = [h_1^2 D_{x_1}^2 + 2 h_1 h_2 D_{x_1} D_{x_2} + h_2^2 D_{x_2}^2] \]

is formally obtained as the square of a binomial. Proceeding in this way, we can prove by induction that, for $k = 1, 2, \dots, n + 1$, the formula for the $k$th derivative of $\phi$ is

\[ \phi^{(k)}(t) = [h_1 D_{x_1} + h_2 D_{x_2}]^k f(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0)), \]

and, using the binomial formula

\[ (a_1 + a_2)^k = \sum_{j=0}^{k} \binom{k}{j} a_1^{k-j} a_2^{j}, \]

we formally have that

\[ [h_1 D_{x_1} + h_2 D_{x_2}]^k = \sum_{j=0}^{k} \binom{k}{j} h_1^{k-j} h_2^{j} D_{x_1}^{k-j} D_{x_2}^{j} \]

(in this formula, the symbols $D_{x_1}^0$ and $D_{x_2}^0$ simply denote the identity operator).

To write the Taylor formula, let us introduce the notation

\[ d^k f(\mathbf{x}_0) \mathbf{h}^k = [h_1 D_{x_1} + h_2 D_{x_2}]^k f(\mathbf{x}_0). \]

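Theorem 10.10 (Taylor Theorem–III) Let $f : \mathcal{O} \to \mathbb{R}$ be a function of class $C^{n+1}$ and $[\mathbf{x}_0, \mathbf{x}]$ be a segment contained in $\mathcal{O}$. Then there exists a $\boldsymbol{\xi} \in \,]\mathbf{x}_0, \mathbf{x}[$ such that

\[ f(\mathbf{x}) = p_n(\mathbf{x}) + r_n(\mathbf{x}), \]

where

\[ p_n(\mathbf{x}) = f(\mathbf{x}_0) + df(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + \frac{1}{2!} d^2 f(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)^2 + \cdots + \frac{1}{n!} d^n f(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)^n \]

is the “$n$th-order Taylor polynomial associated with the function $f$ at the point $\mathbf{x}_0$”, and

\[ r_n(\mathbf{x}) = \frac{1}{(n + 1)!} d^{n+1} f(\boldsymbol{\xi})(\mathbf{x} - \mathbf{x}_0)^{n+1} \]

is the “Lagrange form of the remainder.”


Proof Applying the Taylor formula to the function $\phi$, we have that

\[ \phi(t) = \phi(0) + \phi'(0) t + \frac{1}{2!} \phi''(0) t^2 + \cdots + \frac{1}{n!} \phi^{(n)}(0) t^n + \frac{1}{(n + 1)!} \phi^{(n+1)}(\xi) t^{n+1} \]

for some $\xi \in \,]0, t[$. We thus directly conclude the proof taking $t = 1$ and substituting the values of the derivatives of $\phi$ computed earlier. ∎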


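The Taylor polynomial can be expressed as

\[ p_n(\mathbf{x}) = \sum_{k=0}^{n} \frac{1}{k!} d^k f(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)^k, \]

with the convention that $d^0 f(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)^0$, the first addend in the sum, is simply $f(\mathbf{x}_0)$. Hence,

\[
\begin{aligned}
p_n(\mathbf{x}) &= \sum_{k=0}^{n} \frac{1}{k!} \big[ (x_1 - x_1^0) D_{x_1} + (x_2 - x_2^0) D_{x_2} \big]^k f(\mathbf{x}_0) \\
&= \sum_{k=0}^{n} \frac{1}{k!} \left( \sum_{j=0}^{k} \binom{k}{j} \frac{\partial^k f}{\partial x_1^{k-j} \partial x_2^{j}}(\mathbf{x}_0) (x_1 - x_1^0)^{k-j} (x_2 - x_2^0)^{j} \right).
\end{aligned}
\]

Here is a useful expression for the second-order polynomial:

\[ p_2(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) + \frac{1}{2} \big( Hf(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) \big) \cdot (\mathbf{x} - \mathbf{x}_0). \]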
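The second-order Taylor polynomial can be tested on a concrete function. The sketch below takes $f(x, y) = e^x \cos y$ at $\mathbf{x}_0 = (0, 0)$, where the value, the gradient, and the Hessian matrix are computed by hand; the function and the increments are choices made here. Halving the increment should divide the error by about $8$, since the remainder is of third order.

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)

# Values at x0 = (0, 0), computed by hand for this particular f:
f0 = 1.0
grad = (1.0, 0.0)
hess = ((1.0, 0.0), (0.0, -1.0))

def p2(h):
    """Second-order Taylor polynomial of f at (0, 0), evaluated at the increment h."""
    linear = grad[0] * h[0] + grad[1] * h[1]
    Hh = (hess[0][0] * h[0] + hess[0][1] * h[1],
          hess[1][0] * h[0] + hess[1][1] * h[1])
    return f0 + linear + 0.5 * (Hh[0] * h[0] + Hh[1] * h[1])

def error(h):
    return abs(f(h[0], h[1]) - p2(h))

e_full = error((0.1, 0.2))
e_half = error((0.05, 0.1))

print(e_full, e_half, e_full / e_half)
```

The observed ratio is close to $8$, which matches the Lagrange form of the remainder with $n = 2$.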


The theorem just proved remains valid for any dimension $N$ when the notations are properly interpreted. For example, for any vector $\mathbf{h} = (h_1, h_2, \dots, h_N)$,

\[ d^k f(\mathbf{x}_0) \mathbf{h}^k = [h_1 D_{x_1} + h_2 D_{x_2} + \cdots + h_N D_{x_N}]^k f(\mathbf{x}_0). \]

In this case, when writing explicitly the Taylor polynomial, the following generalization of the binomial formula will be useful:

\[ (a_1 + a_2 + \cdots + a_N)^k = \sum_{m_1 + m_2 + \cdots + m_N = k} \frac{k!}{m_1! \, m_2! \cdots m_N!} \, a_1^{m_1} a_2^{m_2} \cdots a_N^{m_N}. \]


10.5 The Search for Maxima and Minima 


As earlier, let $O \subseteq \mathbb{R}^N$, the domain of our function $f : O \to \mathbb{R}$, be an open set.
Recall that $\mathbf{x}_0 \in O$ is a “local maximum point” for $f$ if there exists a neighborhood
$U \subseteq O$ of $\mathbf{x}_0$ such that $f(U)$ has a maximum and $f(\mathbf{x}_0) = \max f(U)$. A similar
definition holds for “local minimum point.”


Theorem 10.11 (Fermat’s Theorem—II) Assume that $O$ is an open set and
$f : O \to \mathbb{R}$ is differentiable at $\mathbf{x}_0 \in O$. If, moreover, $\mathbf{x}_0$ is a local maximum
or minimum point for $f$, then $\nabla f(\mathbf{x}_0) = 0$.


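Proof If $\mathbf{x}_0$ is a local maximum point, then for every direction $\mathbf{v} \in \mathbb{R}^N$ there is a $\delta > 0$ for which

\[ \frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t} \;\begin{cases} \geq 0 & \text{if } -\delta < t < 0, \\ \leq 0 & \text{if } 0 < t < \delta. \end{cases} \]

Since $f$ is differentiable at $\mathbf{x}_0$, we necessarily have that

\[ \frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}_0) = \lim_{t \to 0} \frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t} = 0. \]

In particular, all partial derivatives are equal to zero, hence $\nabla f(\mathbf{x}_0) = 0$. When $\mathbf{x}_0$ is a local minimum point, the proof is similar. □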


A point where the gradient vanishes is called a “stationary point.” We know
already from the case $N = 1$ that such a point could be neither a local maximum
nor a local minimum point.




We will now show how the Taylor formula provides a criterion establishing when
a stationary point is either a local maximum or a local minimum point. Let us start
with a definition.

We say that a symmetric $N \times N$ matrix $A$ is “positive definite” if

\[ [A\mathbf{h}] \cdot \mathbf{h} > 0, \quad \text{for every } \mathbf{h} \in \mathbb{R}^N \setminus \{0\}. \]

In contrast, we say that $A$ is “negative definite” if the opposite inequality holds, i.e.,
when $-A$ is positive definite.


Theorem 10.12 If $\mathbf{x}_0$ is a stationary point and $f$ is of class $C^2$, with a positive
definite Hessian matrix $Hf(\mathbf{x}_0)$, then $\mathbf{x}_0$ is a local minimum point. In contrast, if
$Hf(\mathbf{x}_0)$ is negative definite, then $\mathbf{x}_0$ is a local maximum point.


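Proof By the Taylor formula, for any $\mathbf{x} \neq \mathbf{x}_0$ in a neighborhood of $\mathbf{x}_0$ there exists a $\boldsymbol{\xi} \in\, ]\mathbf{x}_0, \mathbf{x}[$ for which

\[ f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) + \tfrac{1}{2}\bigl(Hf(\boldsymbol{\xi})(\mathbf{x} - \mathbf{x}_0)\bigr) \cdot (\mathbf{x} - \mathbf{x}_0). \]

If $A = Hf(\mathbf{x}_0)$ is positive definite, there is a constant $c > 0$ such that, for every $\mathbf{v} \in \mathbb{R}^N$ with $\|\mathbf{v}\| = 1$,

\[ [A\mathbf{v}] \cdot \mathbf{v} \geq c. \]

(We have used Weierstrass’ Theorem 4.10 and the fact that the sphere $\{\mathbf{v} \in \mathbb{R}^N : \|\mathbf{v}\| = 1\}$ is a compact set.) Hence,

\[ \left( Hf(\mathbf{x}_0)\, \frac{\mathbf{x} - \mathbf{x}_0}{\|\mathbf{x} - \mathbf{x}_0\|} \right) \cdot \frac{\mathbf{x} - \mathbf{x}_0}{\|\mathbf{x} - \mathbf{x}_0\|} \geq c. \]

Recalling the continuity of the second derivatives, if $\mathbf{x} \neq \mathbf{x}_0$ is sufficiently near $\mathbf{x}_0$, then

\[ \left( Hf(\boldsymbol{\xi})\, \frac{\mathbf{x} - \mathbf{x}_0}{\|\mathbf{x} - \mathbf{x}_0\|} \right) \cdot \frac{\mathbf{x} - \mathbf{x}_0}{\|\mathbf{x} - \mathbf{x}_0\|} \geq \tfrac{1}{2}c. \]

(This can be proved by contradiction using the compactness of the sphere again.)
Since $\nabla f(\mathbf{x}_0) = 0$, for such $\mathbf{x}$ we have that

\[ f(\mathbf{x}) = f(\mathbf{x}_0) + \tfrac{1}{2}\bigl(Hf(\boldsymbol{\xi})(\mathbf{x} - \mathbf{x}_0)\bigr) \cdot (\mathbf{x} - \mathbf{x}_0) \geq f(\mathbf{x}_0) + \tfrac{1}{4}c\,\|\mathbf{x} - \mathbf{x}_0\|^2 > f(\mathbf{x}_0), \]

hence $\mathbf{x}_0$ is a local minimum point.
The proof of the second statement is analogous. □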




We now state (without proof) two useful criteria for determining when a 
symmetric N x N matrix A is positive definite or negative definite. We recall that 
all the eigenvalues of a symmetric matrix are real. 


First Criterion The symmetric matrix A is positive definite if and only if all its 
eigenvalues are positive. It is negative definite if and only if all its eigenvalues are 
negative. 


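Second Criterion The symmetric matrix $A = (a_{ij})_{ij}$ is positive definite if and only if

\[ a_{11} > 0, \quad \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} > 0, \quad \det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} > 0, \quad \dots, \quad \det \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{pmatrix} > 0. \]

It is negative definite if and only if the preceding determinants have alternating signs: those of the $M \times M$ submatrices with $M$ odd are negative, while those with $M$ even are positive.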
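The second criterion translates directly into a small test on leading principal minors. Below is a minimal Python sketch; the function names and the sample Hessian matrices are assumptions made for this illustration, and the determinant is computed by cofactor expansion, which is adequate only for small matrices.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_minors(a):
    """Determinants of the upper-left k x k submatrices, k = 1, ..., N."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]

def is_positive_definite(a):
    return all(d > 0 for d in leading_minors(a))

def is_negative_definite(a):
    # Minors of odd order must be negative, those of even order positive.
    return all((d < 0) if k % 2 == 1 else (d > 0)
               for k, d in enumerate(leading_minors(a), start=1))

if __name__ == "__main__":
    # Hessian of f(x, y) = x^2 + x y + y^2 at its stationary point (0, 0).
    print(is_positive_definite([[2, 1], [1, 2]]))
    # Hessian of f(x, y) = x^2 - y^2 at (0, 0): neither criterion applies.
    h = [[2, 0], [0, -2]]
    print(is_positive_definite(h), is_negative_definite(h))
```

Combined with Theorem 10.12, the first example shows that the origin is a local minimum point of $x^2 + xy + y^2$, while for $x^2 - y^2$ the Hessian matrix is neither positive nor negative definite, so the theorem gives no conclusion.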
10.6 Implicit Function Theorem: First Statement 
We are now concerned with a problem involving a general equation of the type

\[ g(x, y) = 0. \]

The question is whether, for the solutions $(x, y)$ of this equation, it is possible to express $y$ as a function of $x$, say, $y = \eta(x)$. As a typical example, let $g(x, y) = x^2 + y^2 - 1$, so that the equation becomes

\[ x^2 + y^2 = 1, \]

whose solutions lie on the unit circle $S^1$. The answer to the preceding question, in this case, is affirmative provided that we restrict our analysis to a small neighborhood of some particular solution $(x_0, y_0)$, with $y_0 \neq 0$. Indeed, if $y_0 > 0$, we will obtain $\eta(x) = \sqrt{1 - x^2}$, whereas if $y_0 < 0$, we will take $\eta(x) = -\sqrt{1 - x^2}$.

In general, we will show that the same conclusion holds if we take any point $(x_0, y_0)$ for which $g(x_0, y_0) = 0$, provided that $\frac{\partial g}{\partial y}(x_0, y_0) \neq 0$. In such a case, there exists a small neighborhood of $(x_0, y_0)$ where

\[ g(x, y) = 0 \iff y = \eta(x) \]

for some function $\eta$, which thus happens to be “implicitly defined.”

This important result, due to Ulisse Dini, will be later generalized to any finite-dimensional setting.


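Theorem 10.13 (Implicit Function Theorem—I) Let $O \subseteq \mathbb{R} \times \mathbb{R}$ be an open set, $g : O \to \mathbb{R}$ a $C^1$-function, and $(x_0, y_0)$ a point in $O$ for which

\[ g(x_0, y_0) = 0 \quad \text{and} \quad \frac{\partial g}{\partial y}(x_0, y_0) \neq 0. \]

Then there exist an open neighborhood $U$ of $x_0$, an open neighborhood $V$ of $y_0$, and a $C^1$-function $\eta : U \to V$ such that $U \times V \subseteq O$, and, taking $x \in U$ and $y \in V$, we have that

\[ g(x, y) = 0 \iff y = \eta(x). \]

Moreover, the function $\eta$ is of class $C^1$, and the following formula holds:

\[ \eta'(x) = -\left( \frac{\partial g}{\partial y}(x, \eta(x)) \right)^{-1} \frac{\partial g}{\partial x}(x, \eta(x)). \]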


Proof Assume, for instance, that $\frac{\partial g}{\partial y}(x_0, y_0) > 0$. By the continuity of $\frac{\partial g}{\partial y}$, there is a $\delta > 0$ such that $[x_0 - \delta, x_0 + \delta] \times [y_0 - \delta, y_0 + \delta] \subseteq O$ and, if $|x - x_0| \leq \delta$ and $|y - y_0| \leq \delta$, then $\frac{\partial g}{\partial y}(x, y) > 0$. Hence, for every $x \in [x_0 - \delta, x_0 + \delta]$, the function $g(x, \cdot)$ is strictly increasing on $[y_0 - \delta, y_0 + \delta]$. Since $g(x_0, y_0) = 0$, we have that

\[ g(x_0, y_0 - \delta) < 0 < g(x_0, y_0 + \delta). \]

By continuity again, there is a $\delta' > 0$ such that, if $x \in [x_0 - \delta', x_0 + \delta']$, then

\[ g(x, y_0 - \delta) < 0 < g(x, y_0 + \delta). \]

We define $U = \,]x_0 - \delta', x_0 + \delta'[$ and $V = \,]y_0 - \delta, y_0 + \delta[$. Hence, for every $x \in U$, since $g(x, \cdot)$ is strictly increasing, there is exactly one $y \in\, ]y_0 - \delta, y_0 + \delta[$ for which $g(x, y) = 0$; we call $\eta(x)$ such a $y$. We have thus defined a function $\eta : U \to V$ such that, taking $x \in U$ and $y \in V$,

\[ g(x, y) = 0 \iff y = \eta(x). \]


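To verify the continuity of $\eta$, let us fix a $\bar{x} \in U$ and prove that $\eta$ is continuous at $\bar{x}$. With $x \in U$ and considering the function $\gamma : [0, 1] \to U \times V$ defined as

\[ \gamma(t) = \bigl(\bar{x} + t(x - \bar{x}),\; \eta(\bar{x}) + t(\eta(x) - \eta(\bar{x}))\bigr), \]

the Lagrange Mean Value Theorem 6.11 applied to $g \circ \gamma$ tells us that there is a $\xi \in\, ]0, 1[$ for which

\[ g(x, \eta(x)) - g(\bar{x}, \eta(\bar{x})) = \frac{\partial g}{\partial x}(\gamma(\xi))(x - \bar{x}) + \frac{\partial g}{\partial y}(\gamma(\xi))(\eta(x) - \eta(\bar{x})). \]

Since $g(x, \eta(x)) = g(\bar{x}, \eta(\bar{x})) = 0$, we have that

\[ |\eta(x) - \eta(\bar{x})| = \frac{1}{\left|\frac{\partial g}{\partial y}(\gamma(\xi))\right|} \left| \frac{\partial g}{\partial x}(\gamma(\xi))(x - \bar{x}) \right|. \]

Since the partial derivatives of $g$ are continuous and $\frac{\partial g}{\partial y}$ is not zero on the compact set $\overline{U} \times \overline{V}$, we have that there is a constant $c > 0$ for which

\[ \frac{1}{\left|\frac{\partial g}{\partial y}(\gamma(\xi))\right|} \left| \frac{\partial g}{\partial x}(\gamma(\xi))(x - \bar{x}) \right| \leq c\,|x - \bar{x}|. \]

As a consequence, $\eta$ is continuous at $\bar{x}$.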
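We now prove the differentiability. Taking $\bar{x} \in U$ and proceeding as previously, for $h$ small enough we have

\[ \frac{\eta(\bar{x} + h) - \eta(\bar{x})}{h} = -\frac{\frac{\partial g}{\partial x}(\gamma(\xi))}{\frac{\partial g}{\partial y}(\gamma(\xi))}, \]

with $\gamma(\xi)$ belonging to the segment joining $(\bar{x}, \eta(\bar{x}))$ to $(\bar{x} + h, \eta(\bar{x} + h))$. If $h$ tends to 0, we have that $\gamma(\xi)$ tends to $(\bar{x}, \eta(\bar{x}))$, and hence

\[ \eta'(\bar{x}) = \lim_{h \to 0} \frac{\eta(\bar{x} + h) - \eta(\bar{x})}{h} = -\frac{\frac{\partial g}{\partial x}(\bar{x}, \eta(\bar{x}))}{\frac{\partial g}{\partial y}(\bar{x}, \eta(\bar{x}))}. \]

This implies that $\eta$ is of class $C^1$, and

\[ \eta'(x) = -\frac{\frac{\partial g}{\partial x}(x, \eta(x))}{\frac{\partial g}{\partial y}(x, \eta(x))}, \quad \text{for every } x \in U. \]

We have thus completed the proof. □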
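The construction in the proof is effectively an algorithm: for each $x$ near $x_0$, the value $\eta(x)$ is the unique zero of the strictly increasing function $g(x, \cdot)$ on $[y_0 - \delta, y_0 + \delta]$, and it can be located by bisection. The following Python sketch does this for the circle example $g(x, y) = x^2 + y^2 - 1$; the point $(x_0, y_0) = (0.6, 0.8)$, the value $\delta = 0.3$, and the function names are assumptions chosen for the illustration.

```python
def g(x, y):
    # The circle example from the text: g(x, y) = x^2 + y^2 - 1.
    return x * x + y * y - 1.0

X0, Y0 = 0.6, 0.8   # a solution with dg/dy = 2*y0 > 0 (chosen for this sketch)
DELTA = 0.3         # dg/dy = 2*y > 0 on [Y0 - DELTA, Y0 + DELTA] = [0.5, 1.1]

def eta(x, iterations=100):
    """Unique zero of g(x, .) in [Y0 - DELTA, Y0 + DELTA], found by bisection."""
    lo, hi = Y0 - DELTA, Y0 + DELTA
    if not (g(x, lo) < 0.0 < g(x, hi)):
        raise ValueError("x is outside the neighborhood U used in this sketch")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(x, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def eta_prime_formula(x):
    """Derivative of eta from the theorem: -(dg/dx) / (dg/dy) at (x, eta(x))."""
    y = eta(x)
    dg_dx, dg_dy = 2.0 * x, 2.0 * y
    return -dg_dx / dg_dy

def eta_prime_numeric(x, h=1e-5):
    """Central-difference approximation of the derivative of eta."""
    return (eta(x + h) - eta(x - h)) / (2.0 * h)

if __name__ == "__main__":
    print(eta(X0), eta_prime_formula(X0), eta_prime_numeric(X0))
```

Here the explicit solution $\eta(x) = \sqrt{1 - x^2}$ is known, so the sketch only confirms the formula; the same bisection idea applies when no closed form is available, as long as the sign conditions of the proof hold.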


10.7 The Differential of a Vector-Valued Function


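Let us recall the definition given at the beginning of the chapter. The differential of a function $f : O \to \mathbb{R}^M$ at a point $\mathbf{x}_0 \in O$ is a linear function $\ell : \mathbb{R}^N \to \mathbb{R}^M$ for which one can write

\[ f(\mathbf{x}) = f(\mathbf{x}_0) + \ell(\mathbf{x} - \mathbf{x}_0) + r(\mathbf{x}), \]

with

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0. \]

This linear function $\ell$, when it exists, is denoted by $df(\mathbf{x}_0)$.

When $M \geq 2$, let $f_k : O \to \mathbb{R}$ be the components of the function $f : O \to \mathbb{R}^M$, with $k = 1, 2, \dots, M$, so that

\[ f(\mathbf{x}) = (f_1(\mathbf{x}), f_2(\mathbf{x}), \dots, f_M(\mathbf{x})). \]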


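Theorem 10.14 The function $f$ is differentiable at $\mathbf{x}_0$ if and only if all its components are. In this case, for any vector $\mathbf{h} \in \mathbb{R}^N$,

\[ df(\mathbf{x}_0)\mathbf{h} = (df_1(\mathbf{x}_0)\mathbf{h}, df_2(\mathbf{x}_0)\mathbf{h}, \dots, df_M(\mathbf{x}_0)\mathbf{h}). \]

Proof Considering the components in the equation

\[ f(\mathbf{x}) = f(\mathbf{x}_0) + \ell(\mathbf{x} - \mathbf{x}_0) + r(\mathbf{x}), \]

we can write

\[ f_k(\mathbf{x}) = f_k(\mathbf{x}_0) + \ell_k(\mathbf{x} - \mathbf{x}_0) + r_k(\mathbf{x}), \]

with $k = 1, 2, \dots, M$, and we know that

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0 \iff \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r_k(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0 \quad \text{for every } k = 1, 2, \dots, M, \]

whence the conclusion. □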




The preceding theorem permits us to recover all the computational rules obtained in the case $M = 1$. Moreover, the function $f : O \to \mathbb{R}^M$ is said to be “differentiable” or “of class $C^1$” if all its components are. This definition naturally extends to functions of class $C^n$.

Note that when $N = 1$, the differential $df(x_0) : \mathbb{R} \to \mathbb{R}^M$ is the linear function that associates to any $h \in \mathbb{R}$ the vector

\[ df(x_0)(h) = h\, df(x_0)(1). \]

This last vector $df(x_0)(1) \in \mathbb{R}^M$ is called the “derivative” of $f$ at $x_0$ and is usually denoted simply by $f'(x_0)$. Using the preceding definition, one readily sees that

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}, \]

thereby recovering the definition given in Sect. 7.14 and the usual definition given when $N = M = 1$.

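It is useful to consider the matrix associated with the linear function $\ell = df(\mathbf{x}_0)$ given by

\[ \begin{pmatrix} \ell_1(\mathbf{e}_1) & \ell_1(\mathbf{e}_2) & \cdots & \ell_1(\mathbf{e}_N) \\ \ell_2(\mathbf{e}_1) & \ell_2(\mathbf{e}_2) & \cdots & \ell_2(\mathbf{e}_N) \\ \vdots & \vdots & & \vdots \\ \ell_M(\mathbf{e}_1) & \ell_M(\mathbf{e}_2) & \cdots & \ell_M(\mathbf{e}_N) \end{pmatrix}, \]

where $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N$ are the vectors of the canonical basis of $\mathbb{R}^N$. This matrix is called the “Jacobian matrix” associated with the function $f$ at $\mathbf{x}_0$ and is denoted by one of the symbols

\[ Jf(\mathbf{x}_0), \quad f'(\mathbf{x}_0). \]

Recalling that

\[ \frac{\partial f_k}{\partial x_j}(\mathbf{x}_0) = df_k(\mathbf{x}_0)\mathbf{e}_j, \]

with $k = 1, 2, \dots, M$ and $j = 1, 2, \dots, N$, we see that

\[ Jf(\mathbf{x}_0) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(\mathbf{x}_0) & \frac{\partial f_1}{\partial x_2}(\mathbf{x}_0) & \cdots & \frac{\partial f_1}{\partial x_N}(\mathbf{x}_0) \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_M}{\partial x_1}(\mathbf{x}_0) & \frac{\partial f_M}{\partial x_2}(\mathbf{x}_0) & \cdots & \frac{\partial f_M}{\partial x_N}(\mathbf{x}_0) \end{pmatrix}. \]

Remark 10.15 Note that when $M = 1$, i.e., when $f : O \to \mathbb{R}$, then its gradient is

\[ \nabla f(\mathbf{x}_0) = Jf(\mathbf{x}_0)^T, \]

the transpose of the row matrix $Jf(\mathbf{x}_0)$. (Recall that a vector is always a column matrix.)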


10.8 The Chain Rule 


We now examine the differentiability of the composition of functions. As usual, $O$ denotes an open subset of $\mathbb{R}^N$, and $\mathbf{x}_0$ is a point in $O$.

Theorem 10.16 If $f : O \to \mathbb{R}^M$ is differentiable at $\mathbf{x}_0$, while $O' \subseteq \mathbb{R}^M$ is an open set containing $f(O)$ and $g : O' \to \mathbb{R}^L$ is differentiable at $f(\mathbf{x}_0)$, then $g \circ f$ is differentiable at $\mathbf{x}_0$, and

\[ d(g \circ f)(\mathbf{x}_0) = dg(f(\mathbf{x}_0)) \circ df(\mathbf{x}_0). \]
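Proof Setting $\mathbf{y}_0 = f(\mathbf{x}_0)$, we have

\[ f(\mathbf{x}) = f(\mathbf{x}_0) + df(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + r_1(\mathbf{x}), \]
\[ g(\mathbf{y}) = g(\mathbf{y}_0) + dg(\mathbf{y}_0)(\mathbf{y} - \mathbf{y}_0) + r_2(\mathbf{y}), \]

with

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r_1(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0, \qquad \lim_{\mathbf{y} \to \mathbf{y}_0} \frac{r_2(\mathbf{y})}{\|\mathbf{y} - \mathbf{y}_0\|} = 0. \]

Let us introduce the auxiliary function $R_2 : O' \to \mathbb{R}^L$, defined as

\[ R_2(\mathbf{y}) = \begin{cases} \dfrac{r_2(\mathbf{y})}{\|\mathbf{y} - \mathbf{y}_0\|} & \text{if } \mathbf{y} \neq \mathbf{y}_0, \\[2mm] 0 & \text{if } \mathbf{y} = \mathbf{y}_0. \end{cases} \]

Note that $R_2$ is continuous at $\mathbf{y}_0$ and

\[ r_2(\mathbf{y}) = \|\mathbf{y} - \mathbf{y}_0\|\, R_2(\mathbf{y}), \quad \text{for every } \mathbf{y} \in O'. \]

Then

\[ g(f(\mathbf{x})) = g(f(\mathbf{x}_0)) + dg(f(\mathbf{x}_0))[f(\mathbf{x}) - f(\mathbf{x}_0)] + r_2(f(\mathbf{x})) \]
\[ \phantom{g(f(\mathbf{x}))} = g(f(\mathbf{x}_0)) + dg(f(\mathbf{x}_0))[df(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + r_1(\mathbf{x})] + r_2(f(\mathbf{x})) \]
\[ \phantom{g(f(\mathbf{x}))} = g(f(\mathbf{x}_0)) + [dg(f(\mathbf{x}_0)) \circ df(\mathbf{x}_0)](\mathbf{x} - \mathbf{x}_0) + r_3(\mathbf{x}), \]

where

\[ r_3(\mathbf{x}) = dg(f(\mathbf{x}_0))(r_1(\mathbf{x})) + r_2(f(\mathbf{x})) \]
\[ \phantom{r_3(\mathbf{x})} = dg(f(\mathbf{x}_0))(r_1(\mathbf{x})) + \|f(\mathbf{x}) - f(\mathbf{x}_0)\|\, R_2(f(\mathbf{x})) \]
\[ \phantom{r_3(\mathbf{x})} = dg(f(\mathbf{x}_0))(r_1(\mathbf{x})) + \|df(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + r_1(\mathbf{x})\|\, R_2(f(\mathbf{x})). \]

Hence,

\[ \frac{\|r_3(\mathbf{x})\|}{\|\mathbf{x} - \mathbf{x}_0\|} \leq \left\| dg(f(\mathbf{x}_0)) \left( \frac{r_1(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} \right) \right\| + \left( \left\| df(\mathbf{x}_0) \left( \frac{\mathbf{x} - \mathbf{x}_0}{\|\mathbf{x} - \mathbf{x}_0\|} \right) \right\| + \frac{\|r_1(\mathbf{x})\|}{\|\mathbf{x} - \mathbf{x}_0\|} \right) \|R_2(f(\mathbf{x}))\|. \]

We can see that all this tends to 0 as $\mathbf{x} \to \mathbf{x}_0$. Indeed, if $\mathbf{x}$ tends to $\mathbf{x}_0$, the first summand tends to 0, since $dg(f(\mathbf{x}_0)) : \mathbb{R}^M \to \mathbb{R}^L$ is linear, hence continuous, and

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{r_1(\mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = 0. \tag{10.1} \]

On the other hand, since $f$ is continuous at $\mathbf{x}_0$ and $R_2$ is continuous at $\mathbf{y}_0 = f(\mathbf{x}_0)$, with $R_2(\mathbf{y}_0) = 0$, we have that

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \|R_2(f(\mathbf{x}))\| = 0. \]

Finally, since $df(\mathbf{x}_0) : \mathbb{R}^N \to \mathbb{R}^M$ is linear, hence continuous, it is bounded on the compact set $\overline{B}(0, 1)$, by Weierstrass’ Theorem 4.10. Therefore, using also (10.1),

\[ \left\| df(\mathbf{x}_0) \left( \frac{\mathbf{x} - \mathbf{x}_0}{\|\mathbf{x} - \mathbf{x}_0\|} \right) \right\| + \frac{\|r_1(\mathbf{x})\|}{\|\mathbf{x} - \mathbf{x}_0\|} \quad \text{is bounded for } \mathbf{x} \text{ near } \mathbf{x}_0. \]

Summing up,

\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{\|r_3(\mathbf{x})\|}{\|\mathbf{x} - \mathbf{x}_0\|} = 0, \]

and we can conclude that $g \circ f$ is differentiable at $\mathbf{x}_0$, with differential $dg(f(\mathbf{x}_0)) \circ df(\mathbf{x}_0)$. □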


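It is well known that the matrix associated with the composition of two linear functions is the product of the two respective matrices. From the preceding theorem we then have the following formula for the Jacobian matrices:

\[ J(g \circ f)(\mathbf{x}_0) = Jg(f(\mathbf{x}_0)) \cdot Jf(\mathbf{x}_0); \]

this means that the matrix

\[ \begin{pmatrix} \frac{\partial (g \circ f)_1}{\partial x_1}(\mathbf{x}_0) & \cdots & \frac{\partial (g \circ f)_1}{\partial x_N}(\mathbf{x}_0) \\ \vdots & & \vdots \\ \frac{\partial (g \circ f)_L}{\partial x_1}(\mathbf{x}_0) & \cdots & \frac{\partial (g \circ f)_L}{\partial x_N}(\mathbf{x}_0) \end{pmatrix} \]

is equal to the product

\[ \begin{pmatrix} \frac{\partial g_1}{\partial y_1}(f(\mathbf{x}_0)) & \cdots & \frac{\partial g_1}{\partial y_M}(f(\mathbf{x}_0)) \\ \vdots & & \vdots \\ \frac{\partial g_L}{\partial y_1}(f(\mathbf{x}_0)) & \cdots & \frac{\partial g_L}{\partial y_M}(f(\mathbf{x}_0)) \end{pmatrix} \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(\mathbf{x}_0) & \cdots & \frac{\partial f_1}{\partial x_N}(\mathbf{x}_0) \\ \vdots & & \vdots \\ \frac{\partial f_M}{\partial x_1}(\mathbf{x}_0) & \cdots & \frac{\partial f_M}{\partial x_N}(\mathbf{x}_0) \end{pmatrix}. \]

We thus obtain the formula for the partial derivatives of the composition of functions, usually called the chain rule:

\[ \frac{\partial (g \circ f)_i}{\partial x_j}(\mathbf{x}_0) = \frac{\partial g_i}{\partial y_1}(f(\mathbf{x}_0))\, \frac{\partial f_1}{\partial x_j}(\mathbf{x}_0) + \dots + \frac{\partial g_i}{\partial y_M}(f(\mathbf{x}_0))\, \frac{\partial f_M}{\partial x_j}(\mathbf{x}_0) = \sum_{k=1}^{M} \frac{\partial g_i}{\partial y_k}(f(\mathbf{x}_0))\, \frac{\partial f_k}{\partial x_j}(\mathbf{x}_0), \]

where $i = 1, 2, \dots, L$ and $j = 1, 2, \dots, N$.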
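The matrix form of the chain rule is easy to test numerically. In the Python sketch below, the two maps $f : \mathbb{R}^2 \to \mathbb{R}^2$ and $g : \mathbb{R}^2 \to \mathbb{R}^3$, the point $\mathbf{x}_0 = (1, 2)$, and all helper names are assumptions made for this illustration; the Jacobian matrices are approximated by central differences rather than computed symbolically.

```python
import math

def f(x):
    # Illustrative map f : R^2 -> R^2 (an assumption of this sketch).
    x1, x2 = x
    return [x1 * x1 * x2, x1 + x2]

def g(y):
    # Illustrative map g : R^2 -> R^3 (an assumption of this sketch).
    y1, y2 = y
    return [math.sin(y1), y1 * y2, y2 * y2]

def compose(x):
    return g(f(x))

def jacobian(func, x, h=1e-6):
    """Central-difference approximation of the Jacobian matrix of func at x."""
    columns = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        columns.append([(a - b) / (2.0 * h) for a, b in zip(fp, fm)])
    # columns[j][i] approximates d f_i / d x_j; transpose into rows i, columns j.
    return [[columns[j][i] for j in range(len(x))]
            for i in range(len(columns[0]))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def max_abs_difference(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

X0 = [1.0, 2.0]
LEFT = jacobian(compose, X0)                          # J(g o f)(x0), a 3 x 2 matrix
RIGHT = matmul(jacobian(g, f(X0)), jacobian(f, X0))   # Jg(f(x0)) . Jf(x0)

if __name__ == "__main__":
    print(max_abs_difference(LEFT, RIGHT))
```

The two matrices agree up to the discretization and rounding error of the finite differences, which is an experimental check of the formula and not a substitute for the proof.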


Remark 10.17 When $L = 1$, i.e., when $g : O' \to \mathbb{R}$, in view of Remark 10.15 we obtain the formula

\[ \nabla (g \circ f)(\mathbf{x}_0) = Jf(\mathbf{x}_0)^T\, \nabla g(f(\mathbf{x}_0)). \]


Let us now prove the following generalization of the formula for the derivative 
of a product of two functions. 


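Theorem 10.18 Let $f : O \to \mathbb{R}^{N_1}$ and $g : O \to \mathbb{R}^{N_2}$ be two functions, differentiable at some $\mathbf{x}_0$. Let $F : O \to \mathbb{R}^{P}$ be defined as

\[ F(\mathbf{x}) = B(f(\mathbf{x}), g(\mathbf{x})), \]

where $B : \mathbb{R}^{N_1} \times \mathbb{R}^{N_2} \to \mathbb{R}^{P}$ is a bilinear function. Then, for every $\mathbf{h} \in \mathbb{R}^N$,

\[ dF(\mathbf{x}_0)\mathbf{h} = B(df(\mathbf{x}_0)\mathbf{h}, g(\mathbf{x}_0)) + B(f(\mathbf{x}_0), dg(\mathbf{x}_0)\mathbf{h}). \]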




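Proof First define the function $\varphi : O \to \mathbb{R}^{N_1} \times \mathbb{R}^{N_2}$ as $\varphi(\mathbf{x}) = (f(\mathbf{x}), g(\mathbf{x}))$, so that $F = B \circ \varphi$, and note that $d\varphi(\mathbf{x}_0)\mathbf{h} = (df(\mathbf{x}_0)\mathbf{h}, dg(\mathbf{x}_0)\mathbf{h})$ for every $\mathbf{h} \in \mathbb{R}^N$. Then it is sufficient to apply Theorem 10.16, in view of Proposition 10.7. □

The following two examples with $O \subseteq \mathbb{R}$, involving the scalar product and the cross product of two functions, are direct consequences of the preceding formula. Assume that $f, g : O \to \mathbb{R}^M$ are differentiable at some $x_0 \in O$. Then

\[ (f \cdot g)'(x_0) = f'(x_0) \cdot g(x_0) + f(x_0) \cdot g'(x_0). \]

Moreover, if $M = 3$, then

\[ (f \times g)'(x_0) = f'(x_0) \times g(x_0) + f(x_0) \times g'(x_0). \]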


10.9 Mean Value Theorem 

Lagrange’s Theorem 6.11 does not extend directly to functions having vector values. For example, taking $a = 0$ and $b = 2\pi$, the function $f : [a, b] \to \mathbb{R}^2$ defined as $f(x) = (\cos x, \sin x)$ is such that $f(b) - f(a) = (0, 0)$, but there is no $\xi \in\, ]a, b[$ for which $f(b) - f(a) = f'(\xi)(b - a)$ since $f'(\xi) = (-\sin \xi, \cos \xi) \neq (0, 0)$. We will nevertheless try to find a substitute for this theorem, which will be useful in what follows.
We first need the following lemma.


Lemma 10.19 Let $\varphi : [a, b] \to \mathbb{R}^M$ be a differentiable function for which there is a constant $C \geq 0$ such that

\[ \|\varphi'(t)\| \leq C, \quad \text{for every } t \in [a, b]. \]

Then

\[ \|\varphi(b) - \varphi(a)\| \leq C(b - a). \]
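Proof We set $I_0 = [a, b]$. Assume by contradiction that

\[ \|\varphi(b) - \varphi(a)\| - C(b - a) =: \mu > 0. \]

We divide the interval $[a, b]$ into two equal parts, taking the midpoint $m = \frac{a + b}{2}$. Then it can be seen that one of the two following inequalities holds:

\[ \|\varphi(m) - \varphi(a)\| - C(m - a) \geq \frac{\mu}{2}, \qquad \|\varphi(b) - \varphi(m)\| - C(b - m) \geq \frac{\mu}{2}. \]

If the first one holds, we set $I_1 = [a, m]$; otherwise, we set $I_1 = [m, b]$. In the same way, we proceed now to the definition of $I_2$, then $I_3$, and so on. We thus obtain a sequence of compact intervals $I_n = [a_n, b_n]$, with

\[ I_0 \supseteq I_1 \supseteq I_2 \supseteq I_3 \supseteq \dots \]

such that

\[ \|\varphi(b_n) - \varphi(a_n)\| - C(b_n - a_n) \geq \frac{\mu}{2^n} \]

for every $n \in \mathbb{N}$. By Cantor’s Theorem 1.9, there is a $c \in \mathbb{R}$ such that $a_n \leq c \leq b_n$ for every $n \in \mathbb{N}$, and since

\[ b_n - a_n = \frac{b - a}{2^n}, \]

we have that $\lim_n a_n = \lim_n b_n = c$. Since $\varphi$ is differentiable at $c$, we can write

\[ \varphi(t) = \varphi(c) + \varphi'(c)(t - c) + r(t), \]

with

\[ \lim_{t \to c} \frac{r(t)}{t - c} = 0. \]

Let $\varepsilon \in\, \left]0, \frac{\mu}{b - a}\right[$. If $n$ is sufficiently large, we have

\[ \mu \leq 2^n \bigl(\|\varphi(b_n) - \varphi(a_n)\| - C(b_n - a_n)\bigr) \]
\[ \phantom{\mu} \leq 2^n \bigl(\|\varphi(b_n) - \varphi(c)\| + \|\varphi(c) - \varphi(a_n)\| - C(b_n - a_n)\bigr) \]
\[ \phantom{\mu} = 2^n \bigl(\|\varphi'(c)(b_n - c) + r(b_n)\| + \|\varphi'(c)(a_n - c) + r(a_n)\| - C(b_n - a_n)\bigr) \]
\[ \phantom{\mu} \leq 2^n \bigl(\|\varphi'(c)\|\,|b_n - c| + \|r(b_n)\| + \|\varphi'(c)\|\,|a_n - c| + \|r(a_n)\| - C(b_n - a_n)\bigr) \]
\[ \phantom{\mu} \leq 2^n \bigl(C(b_n - c) + \|r(b_n)\| + C(c - a_n) + \|r(a_n)\| - C(b_n - a_n)\bigr) \]
\[ \phantom{\mu} = 2^n \bigl(\|r(b_n)\| + \|r(a_n)\|\bigr) \]
\[ \phantom{\mu} \leq 2^n \bigl(\varepsilon |b_n - c| + \varepsilon |a_n - c|\bigr) = 2^n \varepsilon (b_n - a_n) = \varepsilon (b - a), \]

a contradiction, since $\varepsilon(b - a) < \mu$. This completes the proof. □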


It will now be useful to introduce the norm of a linear function $A : \mathbb{R}^N \to \mathbb{R}^M$ as

\[ \|A\| = \max\{\|A\mathbf{x}\| : \|\mathbf{x}\| = 1\}. \]

Such a maximum exists by Weierstrass’ Theorem 4.10 since the function $A$, being linear, is continuous. The reader might like to check that we have indeed defined a norm, verifying the following properties:

(a) $\|A\| \geq 0$.
(b) $\|A\| = 0 \iff A = 0$.
(c) $\|\alpha A\| = |\alpha|\, \|A\|$.
(d) $\|A + B\| \leq \|A\| + \|B\|$.

Moreover, we have that

\[ \|A\mathbf{x}\| \leq \|A\|\, \|\mathbf{x}\|, \quad \text{for every } \mathbf{x} \in \mathbb{R}^N. \]
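For a concrete feeling of this norm, one can approximate the maximum over the unit sphere by sampling. The Python sketch below does so for a $2 \times 2$ matrix; the matrix $A$, the sample size, and the function names are assumptions of this illustration. The comparison value uses a standard linear algebra fact not stated in the text, namely that this norm equals the square root of the largest eigenvalue of $A^T A$.

```python
import math

# Illustrative matrix (an assumption of this sketch).
A = [[3.0, 0.0],
     [4.0, 5.0]]

def apply(a, x):
    """Matrix-vector product a x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))

def operator_norm_estimate(a, samples=100_000):
    """Approximate max{||a x|| : ||x|| = 1} by sampling the unit circle (2 x 2 case)."""
    best = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        best = max(best, norm(apply(a, [math.cos(t), math.sin(t)])))
    return best

ESTIMATE = operator_norm_estimate(A)
# For this particular A, the matrix A^T A = [[25, 20], [20, 25]] has largest
# eigenvalue 45, so the exact value of the norm is sqrt(45).
EXACT = math.sqrt(45.0)

if __name__ == "__main__":
    print(ESTIMATE, EXACT)
```

Sampling only works in low dimension and gives a lower estimate of the maximum; it is meant to make the definition tangible, not to be a practical way of computing the norm.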

We are now ready to state our extension of Lagrange’s Mean Value Theorem 6.11. Let $O$ be an open set in $\mathbb{R}^N$, and let $f : O \to \mathbb{R}^M$ be a differentiable function.

Theorem 10.20 (Mean Value Theorem) If $[\mathbf{x}_0, \mathbf{x}]$ is a segment contained in $O$, then

\[ \|f(\mathbf{x}) - f(\mathbf{x}_0)\| \leq \sup\{\|df(\mathbf{v})\| : \mathbf{v} \in [\mathbf{x}_0, \mathbf{x}]\}\; \|\mathbf{x} - \mathbf{x}_0\|. \]


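Proof If the supremum is equal to $+\infty$, there is nothing to be proved. Suppose, then, that

\[ \sup\{\|df(\mathbf{v})\| : \mathbf{v} \in [\mathbf{x}_0, \mathbf{x}]\} = C \in \mathbb{R}. \]

We consider the function $\varphi : [0, 1] \to \mathbb{R}^M$, defined as $\varphi(t) = f(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0))$. Then

\[ \|\varphi'(t)\| = \|df(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0))(\mathbf{x} - \mathbf{x}_0)\| \leq \|df(\mathbf{x}_0 + t(\mathbf{x} - \mathbf{x}_0))\|\, \|\mathbf{x} - \mathbf{x}_0\| \leq C\|\mathbf{x} - \mathbf{x}_0\| \]

for every $t \in [0, 1]$. By Lemma 10.19,

\[ \|f(\mathbf{x}) - f(\mathbf{x}_0)\| = \|\varphi(1) - \varphi(0)\| \leq C\|\mathbf{x} - \mathbf{x}_0\|(1 - 0) = C\|\mathbf{x} - \mathbf{x}_0\|, \]

which is exactly what we wanted to prove. □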
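The inequality can be observed on the function $f(x) = (\cos x, \sin x)$ used earlier as a counterexample to the equality form of Lagrange’s theorem. In the Python sketch below, the helper names and the sample points are assumptions of the illustration. Since $N = 1$, the differential $df(v)$ is $h \mapsto h f'(v)$, whose norm is $\|f'(v)\| = 1$, so the supremum equals 1.

```python
import math

def f(x):
    return (math.cos(x), math.sin(x))

def f_prime(x):
    return (-math.sin(x), math.cos(x))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def distance(x0, x):
    """||f(x) - f(x0)||."""
    return norm([a - b for a, b in zip(f(x), f(x0))])

# ||f'(v)|| = 1 for every v, so the supremum in the theorem equals 1 here.
SUP_DF = 1.0

def mean_value_bound_holds(x0, x, tol=1e-12):
    """Check ||f(x) - f(x0)|| <= sup ||df|| * |x - x0| up to rounding."""
    return distance(x0, x) <= SUP_DF * abs(x - x0) + tol

if __name__ == "__main__":
    a, b = 0.0, 2.0 * math.pi
    print(distance(a, b), mean_value_bound_holds(a, b))
```

On $[0, 2\pi]$ the left-hand side is 0 while the right-hand side is $2\pi$: the inequality holds, although no point $\xi$ gives the equality $f(b) - f(a) = f'(\xi)(b - a)$.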




10.10 Implicit Function Theorem: General Statement 


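We will now generalize the Implicit Function Theorem 10.13 to a general finite-dimensional context. Let $O$ be an open subset of $\mathbb{R}^M \times \mathbb{R}^N$ and $g : O \to \mathbb{R}^N$ a $C^1$-function. Hence, $g$ has $N$ components

\[ g(\mathbf{x}, \mathbf{y}) = (g_1(\mathbf{x}, \mathbf{y}), \dots, g_N(\mathbf{x}, \mathbf{y})). \]

Here $\mathbf{x} = (x_1, \dots, x_M) \in \mathbb{R}^M$, and $\mathbf{y} = (y_1, \dots, y_N) \in \mathbb{R}^N$. We will use the following notation for the Jacobian matrices:

\[ \frac{\partial g}{\partial \mathbf{x}}(\mathbf{x}, \mathbf{y}) = \begin{pmatrix} \frac{\partial g_1}{\partial x_1}(\mathbf{x}, \mathbf{y}) & \cdots & \frac{\partial g_1}{\partial x_M}(\mathbf{x}, \mathbf{y}) \\ \vdots & & \vdots \\ \frac{\partial g_N}{\partial x_1}(\mathbf{x}, \mathbf{y}) & \cdots & \frac{\partial g_N}{\partial x_M}(\mathbf{x}, \mathbf{y}) \end{pmatrix}, \qquad \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}, \mathbf{y}) = \begin{pmatrix} \frac{\partial g_1}{\partial y_1}(\mathbf{x}, \mathbf{y}) & \cdots & \frac{\partial g_1}{\partial y_N}(\mathbf{x}, \mathbf{y}) \\ \vdots & & \vdots \\ \frac{\partial g_N}{\partial y_1}(\mathbf{x}, \mathbf{y}) & \cdots & \frac{\partial g_N}{\partial y_N}(\mathbf{x}, \mathbf{y}) \end{pmatrix}. \]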


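Theorem 10.21 (Implicit Function Theorem—II) Let $O \subseteq \mathbb{R}^M \times \mathbb{R}^N$ be an open set, $g : O \to \mathbb{R}^N$ a $C^1$-function, and $(\mathbf{x}_0, \mathbf{y}_0)$ a point in $O$ for which

\[ g(\mathbf{x}_0, \mathbf{y}_0) = 0 \quad \text{and} \quad \det \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}_0, \mathbf{y}_0) \neq 0. \]

Then there exist an open neighborhood $U$ of $\mathbf{x}_0$, an open neighborhood $V$ of $\mathbf{y}_0$, and a $C^1$-function $\eta : U \to V$ such that $U \times V \subseteq O$, and, taking $\mathbf{x} \in U$ and $\mathbf{y} \in V$, we have that

\[ g(\mathbf{x}, \mathbf{y}) = 0 \iff \mathbf{y} = \eta(\mathbf{x}). \]

Moreover, the function $\eta$ is of class $C^1$, and the following formula holds true:

\[ J\eta(\mathbf{x}) = -\left( \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}, \eta(\mathbf{x})) \right)^{-1} \frac{\partial g}{\partial \mathbf{x}}(\mathbf{x}, \eta(\mathbf{x})). \]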


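Proof In the case where $N = 1$, the definition of $\eta$ is almost the same as the one given in the proof of Theorem 10.13. It will be sufficient to replace the interval $[x_0 - \delta, x_0 + \delta]$ with the closed ball $\overline{B}(\mathbf{x}_0, \delta)$ and to replace $]x_0 - \delta', x_0 + \delta'[$ with the open ball $B(\mathbf{x}_0, \delta')$. Once the function $\eta : U \to V$ has been defined, let us see how to prove its continuity and its differentiability.

To verify the continuity of $\eta$, let us fix a $\bar{\mathbf{x}} \in U$ and prove that $\eta$ is continuous at $\bar{\mathbf{x}}$. If we take $\mathbf{x} \in U$ and consider the function $\gamma : [0, 1] \to U \times V$, defined as

\[ \gamma(t) = \bigl(\bar{\mathbf{x}} + t(\mathbf{x} - \bar{\mathbf{x}}),\; \eta(\bar{\mathbf{x}}) + t(\eta(\mathbf{x}) - \eta(\bar{\mathbf{x}}))\bigr), \]

Lagrange’s Mean Value Theorem 6.11 applied to $g \circ \gamma$ tells us that there is a $\xi \in\, ]0, 1[$ for which

\[ g(\mathbf{x}, \eta(\mathbf{x})) - g(\bar{\mathbf{x}}, \eta(\bar{\mathbf{x}})) = \frac{\partial g}{\partial \mathbf{x}}(\gamma(\xi))(\mathbf{x} - \bar{\mathbf{x}}) + \frac{\partial g}{\partial y}(\gamma(\xi))(\eta(\mathbf{x}) - \eta(\bar{\mathbf{x}})). \]

Since $g(\mathbf{x}, \eta(\mathbf{x})) = g(\bar{\mathbf{x}}, \eta(\bar{\mathbf{x}})) = 0$, we have that

\[ |\eta(\mathbf{x}) - \eta(\bar{\mathbf{x}})| = \frac{1}{\left|\frac{\partial g}{\partial y}(\gamma(\xi))\right|} \left| \frac{\partial g}{\partial \mathbf{x}}(\gamma(\xi))(\mathbf{x} - \bar{\mathbf{x}}) \right|. \]

Since the partial derivatives of $g$ are continuous and $\frac{\partial g}{\partial y}$ is not zero on the compact set $\overline{U} \times \overline{V}$, we have that there is a constant $c > 0$ for which

\[ \frac{1}{\left|\frac{\partial g}{\partial y}(\gamma(\xi))\right|} \left| \frac{\partial g}{\partial \mathbf{x}}(\gamma(\xi))(\mathbf{x} - \bar{\mathbf{x}}) \right| \leq c\,\|\mathbf{x} - \bar{\mathbf{x}}\|. \]

As a consequence, $\eta$ is continuous at $\bar{\mathbf{x}}$.

We now prove the differentiability. Taking $\bar{\mathbf{x}} = (\bar{x}_1, \bar{x}_2, \dots, \bar{x}_M)$, let $\mathbf{x} = (\bar{x}_1 + h, \bar{x}_2, \dots, \bar{x}_M)$; proceeding as previously, for $h$ small enough we have

\[ \frac{\eta(\bar{x}_1 + h, \bar{x}_2, \dots, \bar{x}_M) - \eta(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_M)}{h} = -\frac{\frac{\partial g}{\partial x_1}(\gamma(\xi))}{\frac{\partial g}{\partial y}(\gamma(\xi))}, \]

with $\gamma(\xi)$ belonging to the segment joining $(\bar{\mathbf{x}}, \eta(\bar{\mathbf{x}}))$ to $(\mathbf{x}, \eta(\mathbf{x}))$. If $h$ tends to 0, we have that $\gamma(\xi)$ tends to $(\bar{\mathbf{x}}, \eta(\bar{\mathbf{x}}))$, and hence

\[ \frac{\partial \eta}{\partial x_1}(\bar{\mathbf{x}}) = \lim_{h \to 0} \frac{\eta(\bar{x}_1 + h, \bar{x}_2, \dots, \bar{x}_M) - \eta(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_M)}{h} = -\frac{\frac{\partial g}{\partial x_1}(\bar{\mathbf{x}}, \eta(\bar{\mathbf{x}}))}{\frac{\partial g}{\partial y}(\bar{\mathbf{x}}, \eta(\bar{\mathbf{x}}))}. \]

The partial derivatives with respect to $x_2, \dots, x_M$ are computed similarly, thereby yielding that $\eta$ is of class $C^1$ and

\[ J\eta(\mathbf{x}) = -\frac{1}{\frac{\partial g}{\partial y}(\mathbf{x}, \eta(\mathbf{x}))}\, \frac{\partial g}{\partial \mathbf{x}}(\mathbf{x}, \eta(\mathbf{x})), \quad \text{for every } \mathbf{x} \in U. \]

We have thus proved the theorem in the case $N = 1$.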




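We now assume that the statement holds till $N - 1$ for some $N \geq 2$ (and any $M \geq 1$) and prove that it then also holds for $N$. We will use the notation

\[ \mathbf{y}_1 = (y_1, \dots, y_{N-1}), \]

and we will write $\mathbf{y} = (\mathbf{y}_1, y_N)$. Since

\[ \det \begin{pmatrix} \frac{\partial g_1}{\partial y_1}(\mathbf{x}_0, \mathbf{y}_0) & \cdots & \frac{\partial g_1}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \\ \vdots & & \vdots \\ \frac{\partial g_N}{\partial y_1}(\mathbf{x}_0, \mathbf{y}_0) & \cdots & \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \end{pmatrix} \neq 0, \]

at least one of the elements in the last column is different from zero. We can assume without loss of generality, possibly changing the rows, that $\frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \neq 0$. Writing $\mathbf{y}_0 = (\mathbf{y}_1^0, y_N^0)$, with $\mathbf{y}_1^0 = (y_1^0, \dots, y_{N-1}^0)$, we then have

\[ g_N(\mathbf{x}_0, \mathbf{y}_1^0, y_N^0) = 0 \quad \text{and} \quad \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_1^0, y_N^0) \neq 0. \]

Then, by the already proved one-dimensional case, there are an open neighborhood $U_1$ of $(\mathbf{x}_0, \mathbf{y}_1^0)$, an open neighborhood $V_N$ of $y_N^0$, and a $C^1$-function $\eta_1 : U_1 \to V_N$ such that $U_1 \times V_N \subseteq O$, with the following properties. If $(\mathbf{x}, \mathbf{y}_1) \in U_1$ and $y_N \in V_N$, then

\[ g_N(\mathbf{x}, \mathbf{y}_1, y_N) = 0 \iff y_N = \eta_1(\mathbf{x}, \mathbf{y}_1) \]

and

\[ J\eta_1(\mathbf{x}, \mathbf{y}_1) = -\frac{1}{\frac{\partial g_N}{\partial y_N}(\mathbf{x}, \mathbf{y}_1, \eta_1(\mathbf{x}, \mathbf{y}_1))}\, \frac{\partial g_N}{\partial (\mathbf{x}, \mathbf{y}_1)}(\mathbf{x}, \mathbf{y}_1, \eta_1(\mathbf{x}, \mathbf{y}_1)). \]

We may assume that $U_1$ is of the type $\tilde{U} \times \tilde{V}_1$, with $\tilde{U}$ being an open neighborhood of $\mathbf{x}_0$ and $\tilde{V}_1$ an open neighborhood of $\mathbf{y}_1^0$.
Let us define the function $\phi : \tilde{U} \times \tilde{V}_1 \to \mathbb{R}^{N-1}$ by setting

\[ \phi(\mathbf{x}, \mathbf{y}_1) = \bigl(g_1(\mathbf{x}, \mathbf{y}_1, \eta_1(\mathbf{x}, \mathbf{y}_1)), \dots, g_{N-1}(\mathbf{x}, \mathbf{y}_1, \eta_1(\mathbf{x}, \mathbf{y}_1))\bigr). \]

For brevity’s sake we will write

\[ g_{(1,\dots,N-1)}(\mathbf{x}, \mathbf{y}) = (g_1(\mathbf{x}, \mathbf{y}), \dots, g_{N-1}(\mathbf{x}, \mathbf{y})), \]

so that

\[ \phi(\mathbf{x}, \mathbf{y}_1) = g_{(1,\dots,N-1)}(\mathbf{x}, \mathbf{y}_1, \eta_1(\mathbf{x}, \mathbf{y}_1)). \]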




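Note that, since $\eta_1(\mathbf{x}_0, \mathbf{y}_1^0) = y_N^0$, we have that

\[ \phi(\mathbf{x}_0, \mathbf{y}_1^0) = g_{(1,\dots,N-1)}(\mathbf{x}_0, \mathbf{y}_0) = 0 \]

and

\[ \frac{\partial \phi}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) = \frac{\partial g_{(1,\dots,N-1)}}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_0) + \frac{\partial g_{(1,\dots,N-1)}}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0). \tag{10.2} \]

Moreover, since $g_N(\mathbf{x}, \mathbf{y}_1, \eta_1(\mathbf{x}, \mathbf{y}_1)) = 0$ for every $(\mathbf{x}, \mathbf{y}_1) \in U_1$, differentiating with respect to $\mathbf{y}_1$ we see that

\[ 0 = \frac{\partial g_N}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_0) + \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0). \tag{10.3} \]

Let us write the identity

\[ \det \frac{\partial \phi}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) = \frac{1}{\frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)}\, \det \begin{pmatrix} \frac{\partial \phi}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) & \frac{\partial g_{(1,\dots,N-1)}}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \\[1mm] 0 & \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \end{pmatrix}. \]

Substituting the two equalities (10.2) and (10.3), we have that

\[ \det \begin{pmatrix} \frac{\partial \phi}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) & \frac{\partial g_{(1,\dots,N-1)}}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \\[1mm] 0 & \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \end{pmatrix} \]
\[ = \det \begin{pmatrix} \frac{\partial g_{(1,\dots,N-1)}}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_0) + \frac{\partial g_{(1,\dots,N-1)}}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) & \frac{\partial g_{(1,\dots,N-1)}}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \\[1mm] \frac{\partial g_N}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_0) + \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) & \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \end{pmatrix} \]
\[ = \det \left( \frac{\partial g}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_0) + \frac{\partial g}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) \;\middle|\; \frac{\partial g}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0) \right) \]
\[ = \det \left[ \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}_0, \mathbf{y}_0) + \left( \frac{\partial g}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) \;\middle|\; 0 \right) \right]. \]

We now recall that adding a scalar multiple of one column to another column of a matrix does not change the value of its determinant. Hence, since

\[ \left( \frac{\partial g}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) \;\middle|\; 0 \right) = \begin{pmatrix} \frac{\partial g_1}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial y_1}(\mathbf{x}_0, \mathbf{y}_1^0) & \cdots & \frac{\partial g_1}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial y_{N-1}}(\mathbf{x}_0, \mathbf{y}_1^0) & 0 \\ \vdots & & \vdots & \vdots \\ \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial y_1}(\mathbf{x}_0, \mathbf{y}_1^0) & \cdots & \frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial y_{N-1}}(\mathbf{x}_0, \mathbf{y}_1^0) & 0 \end{pmatrix}, \]

it must be that

\[ \det \left[ \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}_0, \mathbf{y}_0) + \left( \frac{\partial g}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)\, \frac{\partial \eta_1}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) \;\middle|\; 0 \right) \right] = \det \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}_0, \mathbf{y}_0). \]

Thus,

\[ \det \frac{\partial \phi}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) = \frac{1}{\frac{\partial g_N}{\partial y_N}(\mathbf{x}_0, \mathbf{y}_0)}\, \det \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}_0, \mathbf{y}_0), \]

and finally we have

\[ \phi(\mathbf{x}_0, \mathbf{y}_1^0) = 0 \quad \text{and} \quad \det \frac{\partial \phi}{\partial \mathbf{y}_1}(\mathbf{x}_0, \mathbf{y}_1^0) \neq 0. \]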
1 


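By the inductive assumption, there are an open neighborhood $U$ of $\mathbf{x}_0$, an open neighborhood $V_1$ of $\mathbf{y}_1^0$, and a $C^1$-function $\eta_2 : U \to V_1$ such that $U \times V_1 \subseteq \tilde{U} \times \tilde{V}_1$, and the following holds: For every $\mathbf{x} \in U$ and $\mathbf{y}_1 \in V_1$,

\[ \phi(\mathbf{x}, \mathbf{y}_1) = 0 \iff \mathbf{y}_1 = \eta_2(\mathbf{x}). \]

In conclusion, for $\mathbf{x} \in U$ and $\mathbf{y} = (\mathbf{y}_1, y_N) \in V_1 \times V_N$, we have that

\[ g(\mathbf{x}, \mathbf{y}) = 0 \iff \begin{cases} g_{(1,\dots,N-1)}(\mathbf{x}, \mathbf{y}_1, y_N) = 0 \\ g_N(\mathbf{x}, \mathbf{y}_1, y_N) = 0 \end{cases} \]
\[ \iff \begin{cases} g_{(1,\dots,N-1)}(\mathbf{x}, \mathbf{y}_1, y_N) = 0 \\ y_N = \eta_1(\mathbf{x}, \mathbf{y}_1) \end{cases} \]
\[ \iff \begin{cases} \phi(\mathbf{x}, \mathbf{y}_1) = 0 \\ y_N = \eta_1(\mathbf{x}, \mathbf{y}_1) \end{cases} \]
\[ \iff \begin{cases} \mathbf{y}_1 = \eta_2(\mathbf{x}) \\ y_N = \eta_1(\mathbf{x}, \mathbf{y}_1) \end{cases} \]
\[ \iff \mathbf{y} = (\eta_2(\mathbf{x}), \eta_1(\mathbf{x}, \eta_2(\mathbf{x}))). \]

Setting $V = V_1 \times V_N$, we may then define the function $\eta : U \to V$ as

\[ \eta(\mathbf{x}) = (\eta_2(\mathbf{x}), \eta_1(\mathbf{x}, \eta_2(\mathbf{x}))). \]

This function is of class $C^1$ since both $\eta_1$ and $\eta_2$ are as well. Since $g(\mathbf{x}, \eta(\mathbf{x})) = 0$ for every $\mathbf{x} \in U$, we easily deduce that

\[ \frac{\partial g}{\partial \mathbf{x}}(\mathbf{x}, \eta(\mathbf{x})) + \frac{\partial g}{\partial \mathbf{y}}(\mathbf{x}, \eta(\mathbf{x}))\, J\eta(\mathbf{x}) = 0, \]

whence the formula for $J\eta(\mathbf{x})$. □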


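Clearly, the following analogous statement holds true, where the roles of $\mathbf{x}$ and $\mathbf{y}$ are interchanged.

Theorem 10.22 (Implicit Function Theorem—III) Let $O \subseteq \mathbb{R}^M \times \mathbb{R}^N$ be an open set, $g : O \to \mathbb{R}^M$ a $C^1$-function, and $(\mathbf{x}_0, \mathbf{y}_0)$ a point in $O$ for which

\[ g(\mathbf{x}_0, \mathbf{y}_0) = 0 \quad \text{and} \quad \det \frac{\partial g}{\partial \mathbf{x}}(\mathbf{x}_0, \mathbf{y}_0) \neq 0. \]

Then there exist an open neighborhood $U$ of $\mathbf{x}_0$, an open neighborhood $V$ of $\mathbf{y}_0$, and a $C^1$-function $\eta : V \to U$ such that $U \times V \subseteq O$, and, taking $\mathbf{x} \in U$ and $\mathbf{y} \in V$, we have that

\[ g(\mathbf{x}, \mathbf{y}) = 0 \iff \mathbf{x} = \eta(\mathbf{y}). \]

Moreover, the function $\eta$ is of class $C^1$, and the following formula holds:

\[ J\eta(\mathbf{y}) = -\left( \frac{\partial g}{\partial \mathbf{x}}(\eta(\mathbf{y}), \mathbf{y}) \right)^{-1} \frac{\partial g}{\partial \mathbf{y}}(\eta(\mathbf{y}), \mathbf{y}). \]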




10.11 Local Diffeomorphisms 

Let us introduce the notion of “diffeomorphism.” 

Definition 10.23 Given $A$ and $B$, two open subsets of $\mathbb{R}^N$, a function $\varphi : A \to B$ is said to be a “diffeomorphism” if it is of class $C^1$, it is a bijection, and its inverse $\varphi^{-1} : B \to A$ is also of class $C^1$.


Let us state the following important consequence of the Implicit Function 
Theorem. 


Theorem 10.24 (Local Diffeomorphism Theorem) Let $A$ be an open subset of $\mathbb{R}^N$, and let $\varphi : A \to \mathbb{R}^N$ be a $C^1$-function. If, for some $\mathbf{x}_0 \in A$, we have that $\det J\varphi(\mathbf{x}_0) \neq 0$, then there exist an open neighborhood $U$ of $\mathbf{x}_0$ contained in $A$ and an open neighborhood $V$ of $\varphi(\mathbf{x}_0)$ such that $\varphi(U) = V$, and the restricted function $\varphi|_U : U \to V$ is a diffeomorphism.


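Proof We consider the function $g : A \times \mathbb{R}^N \to \mathbb{R}^N$ defined as

\[ g(\mathbf{x}, \mathbf{y}) = \varphi(\mathbf{x}) - \mathbf{y}. \]

Setting $\mathbf{y}_0 = \varphi(\mathbf{x}_0)$, we have that

\[ g(\mathbf{x}_0, \mathbf{y}_0) = 0 \quad \text{and} \quad \det \frac{\partial g}{\partial \mathbf{x}}(\mathbf{x}_0, \mathbf{y}_0) = \det J\varphi(\mathbf{x}_0) \neq 0. \]

By the Implicit Function Theorem 10.22, there exist an open neighborhood $V$ of $\mathbf{y}_0$, an open neighborhood $U$ of $\mathbf{x}_0$, and a $C^1$-function $\eta : V \to U$ such that $U \subseteq A$ and, taking $\mathbf{y} \in V$ and $\mathbf{x} \in U$,

\[ \varphi(\mathbf{x}) = \mathbf{y} \iff g(\mathbf{x}, \mathbf{y}) = 0 \iff \mathbf{x} = \eta(\mathbf{y}). \]

Hence, $\eta = \varphi|_U^{-1}$, and the proof is thus completed. □

The following corollary will be useful.

Corollary 10.25 Let $A \subseteq \mathbb{R}^N$ be an open set and $\sigma : A \to \mathbb{R}^N$ an injective $C^1$-function such that $\det J\sigma(\mathbf{x}) \neq 0$ for every $\mathbf{x} \in A$. Then the set $B = \sigma(A)$ is open, and the function $\varphi : A \to B$ defined as $\varphi(\mathbf{x}) = \sigma(\mathbf{x})$ is a diffeomorphism.

Proof For every $\mathbf{y}_0 \in \sigma(A)$ there is a unique $\mathbf{x}_0 \in A$ such that $\sigma(\mathbf{x}_0) = \mathbf{y}_0$, and we know that $\det J\sigma(\mathbf{x}_0) \neq 0$. Hence, by the Local Diffeomorphism Theorem 10.24, there exist an open neighborhood $U$ of $\mathbf{x}_0$ contained in $A$ and an open neighborhood $V$ of $\mathbf{y}_0$ such that $\sigma(U) = V$, and the restricted function $\sigma|_U : U \to V$ is a diffeomorphism. Then $V = \sigma(U) \subseteq \sigma(A)$, thereby proving that $\sigma(A)$ is an open set.

In conclusion, the function $\varphi : A \to \sigma(A)$ defined as $\varphi(\mathbf{x}) = \sigma(\mathbf{x})$ is bijective and of class $C^1$, and, being a local diffeomorphism, its inverse $\varphi^{-1} : \sigma(A) \to A$ is of class $C^1$ as well. □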


We now derive the formula for the differential of the inverse function. 


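Theorem 10.26 Let $\varphi : A \to B$ be a diffeomorphism, take $\mathbf{x}_0 \in A$, and let $\mathbf{y}_0 = \varphi(\mathbf{x}_0)$. Then $d\varphi(\mathbf{x}_0)$ is invertible, and

\[ d\varphi^{-1}(\mathbf{y}_0) = d\varphi(\mathbf{x}_0)^{-1}, \]

whence

\[ J\varphi^{-1}(\mathbf{y}_0) = [J\varphi(\mathbf{x}_0)]^{-1}. \]

Proof Observe that $\varphi^{-1} \circ \varphi : A \to A$ is the identity function on $A$, and $\varphi \circ \varphi^{-1} : B \to B$ is the identity function on $B$. Then their differentials at $\mathbf{x}_0$ and at $\mathbf{y}_0$, respectively, are also identity functions $I$, and hence

\[ I = d(\varphi^{-1} \circ \varphi)(\mathbf{x}_0) = d\varphi^{-1}(\varphi(\mathbf{x}_0)) \circ d\varphi(\mathbf{x}_0) = d\varphi^{-1}(\mathbf{y}_0) \circ d\varphi(\mathbf{x}_0), \]
\[ I = d(\varphi \circ \varphi^{-1})(\mathbf{y}_0) = d\varphi(\varphi^{-1}(\mathbf{y}_0)) \circ d\varphi^{-1}(\mathbf{y}_0) = d\varphi(\mathbf{x}_0) \circ d\varphi^{-1}(\mathbf{y}_0). \]

This proves that $d\varphi(\mathbf{x}_0)$ is invertible, and $d\varphi^{-1}(\mathbf{y}_0)$ is its inverse. The equality for the Jacobian matrices is a consequence of the fact that the matrix associated with the inverse of a linear function is the inverse of the matrix of that linear function. □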
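The formula for the Jacobian matrix of the inverse can be checked on polar coordinates. The Python sketch below is only an illustration under stated assumptions: the map $\varphi(\rho, \theta) = (\rho\cos\theta, \rho\sin\theta)$ is restricted to $]0, +\infty[\, \times\, ]-\pi, \pi[$, where it is a diffeomorphism onto the plane without the nonpositive $x$-axis; the sample point and the function names are chosen for the example, and both Jacobian matrices are written out by hand.

```python
import math

def phi(rho, theta):
    """Polar coordinates map, assumed restricted to rho > 0, -pi < theta < pi."""
    return (rho * math.cos(theta), rho * math.sin(theta))

def phi_inverse(x, y):
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian_phi(rho, theta):
    return [[math.cos(theta), -rho * math.sin(theta)],
            [math.sin(theta),  rho * math.cos(theta)]]

def jacobian_phi_inverse(x, y):
    # Partial derivatives of (sqrt(x^2 + y^2), atan2(y, x)), computed by hand.
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

X0 = (2.0, 0.7)    # (rho, theta), a sample point chosen for this sketch
Y0 = phi(*X0)
# J(phi^{-1})(y0) . J(phi)(x0) should be the 2 x 2 identity matrix.
PRODUCT = matmul(jacobian_phi_inverse(*Y0), jacobian_phi(*X0))

if __name__ == "__main__":
    print(PRODUCT)
```

The product is the identity matrix up to rounding, in agreement with $J\varphi^{-1}(\mathbf{y}_0) = [J\varphi(\mathbf{x}_0)]^{-1}$.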


10.12 M-Surfaces 


We often hear talk of “curves” and “surfaces” without a precise definition of what, 
in fact, they are. We now begin examining these objects from a dynamical point 
of view. The motivation comes from a typical situation in physics when one wants 
to describe the trajectory of a moving object. Assuming that the object is a point, 
surely enough we would not be satisfied if we were only told, for example, that its 
trajectory was a circle. We would also like to know how the object moves on this 
circle: Is its speed constant or varying? Is it moving clockwise or counterclockwise? 
Or is it oscillating back and forth? 

To satisfy the need to know precisely how the object is moving, we introduce a function, defined on some interval $[a, b]$, which to each instant of time $t \in [a, b]$ associates its position in space, say, $\sigma(t)$. Such a function $\sigma : [a, b] \to \mathbb{R}^N$, if it is sufficiently regular, will be called a “curve” in $\mathbb{R}^N$.




Similar observations can be made for a “surface,” which will be a function defined on some rectangle $[a_1, b_1] \times [a_2, b_2]$. (The choice of a rectangular domain is made for simplicity.) These two situations will now be generalized to an arbitrary dimension $M$, leading to the concept of “$M$-surface.”

We denote by $I$ a “rectangle” in $\mathbb{R}^M$, i.e., a set of the type

\[ I = [a_1, b_1] \times \dots \times [a_M, b_M]. \]

This word is surely familiar in the case of $M = 2$. If $M = 1$, a rectangle happens to be a compact interval, whereas if $M = 3$, we usually prefer to call it a “rectangular parallelepiped” or “cuboid.”


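Definition 10.27 Let $1 \leq M \leq N$. We call “$M$-surface” in $\mathbb{R}^N$ a function $\sigma : I \to \mathbb{R}^N$ of class $C^1$. If $M = 1$, then $\sigma$ is also said to be a “curve”; if $M = 2$, then we will simply call it a “surface.” The set $\sigma(I)$ is the “image” of the $M$-surface $\sigma$. We will say that the $M$-surface $\sigma$ is “regular” if, for every $\mathbf{u} \in \mathring{I}$ (the interior of $I$), the Jacobian matrix $J\sigma(\mathbf{u})$ has rank $M$.

A curve in $\mathbb{R}^N$ is a function $\sigma : [a, b] \to \mathbb{R}^N$, with

\[ \sigma(t) = (\sigma_1(t), \dots, \sigma_N(t)). \]

The curve is regular if, for every $t \in\, ]a, b[$, the vector $\sigma'(t) = (\sigma_1'(t), \dots, \sigma_N'(t))$ is different from zero, i.e., $\sigma'(t) \neq (0, \dots, 0)$. In that case, it is possible to define the “tangent unit vector” at the point $\sigma(t)$,

\[ \tau_\sigma(t) = \frac{\sigma'(t)}{\|\sigma'(t)\|}. \]

[Figure: the tangent unit vector $\tau_\sigma(t)$ drawn at the point $\sigma(t)$ of the curve, with its tip at $\sigma(t) + \tau_\sigma(t)$.]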
Example The curve $\sigma : [0, 2\pi] \to \mathbb{R}^3$, defined by

\[ \sigma(t) = (R\cos(2t), R\sin(2t), 0), \]

has as its image the circle

\[ \{(x, y, z) : x^2 + y^2 = R^2,\ z = 0\} \]

(which is covered twice). Since $\sigma'(t) = (-2R\sin(2t), 2R\cos(2t), 0)$, it is a regular curve, and

\[ \tau_\sigma(t) = (-\sin(2t), \cos(2t), 0). \]
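A quick numerical check of this example is sketched below in Python. The radius value, the step of the finite difference, and the function names are assumptions of the illustration; the derivative $\sigma'(t)$ is approximated by central differences and then normalized.

```python
import math

R = 3.0  # radius; an arbitrary value chosen for this sketch

def sigma(t):
    return (R * math.cos(2 * t), R * math.sin(2 * t), 0.0)

def sigma_prime(t, h=1e-6):
    """Central-difference approximation of sigma'(t)."""
    return tuple((p - m) / (2 * h) for p, m in zip(sigma(t + h), sigma(t - h)))

def tangent_unit_vector(t):
    d = sigma_prime(t)
    length = math.sqrt(sum(c * c for c in d))
    return tuple(c / length for c in d)

def expected_tangent(t):
    """Closed form from the example: (-sin 2t, cos 2t, 0)."""
    return (-math.sin(2 * t), math.cos(2 * t), 0.0)

if __name__ == "__main__":
    t = 1.0
    print(tangent_unit_vector(t), expected_tangent(t))
```

The speed $\|\sigma'(t)\| = 2R$ never vanishes, which is the regularity condition, and the normalized vector does not depend on $R$.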
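A surface in $\mathbb{R}^3$ is a function $\sigma : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3$. The surface is regular if, for every $(u, v) \in\, ]a_1, b_1[\, \times\, ]a_2, b_2[$, the vectors $\frac{\partial \sigma}{\partial u}(u, v)$, $\frac{\partial \sigma}{\partial v}(u, v)$ are linearly independent. In that case, they determine a plane, called the “tangent plane” to the surface at the point $\sigma(u, v)$, and it is possible to define the “normal unit vector”

\[ \nu_\sigma(u, v) = \frac{\frac{\partial \sigma}{\partial u}(u, v) \times \frac{\partial \sigma}{\partial v}(u, v)}{\left\| \frac{\partial \sigma}{\partial u}(u, v) \times \frac{\partial \sigma}{\partial v}(u, v) \right\|}, \]

which is visualized in the following figure.

[Figure: the normal unit vector $\nu_\sigma(u, v)$ drawn at the point $\sigma(u, v)$ of the surface, with its tip at $\sigma(u, v) + \nu_\sigma(u, v)$.]

Example 1 The surface $\sigma : [0, \pi] \times [0, \pi] \to \mathbb{R}^3$, defined by

\[ \sigma(\phi, \theta) = (R\sin\phi\cos\theta, R\sin\phi\sin\theta, R\cos\phi), \]

has as its image the hemisphere

\[ \{(x, y, z) : x^2 + y^2 + z^2 = R^2,\ y \geq 0\}. \]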




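Since

\[ \frac{\partial \sigma}{\partial \phi}(\phi, \theta) = (R\cos\phi\cos\theta, R\cos\phi\sin\theta, -R\sin\phi), \]
\[ \frac{\partial \sigma}{\partial \theta}(\phi, \theta) = (-R\sin\phi\sin\theta, R\sin\phi\cos\theta, 0), \]

we compute

\[ \frac{\partial \sigma}{\partial \phi}(\phi, \theta) \times \frac{\partial \sigma}{\partial \theta}(\phi, \theta) = (R^2\sin^2\phi\cos\theta, R^2\sin^2\phi\sin\theta, R^2\sin\phi\cos\phi). \]

We thus see that it is a regular surface, and

\[ \nu_\sigma(\phi, \theta) = (\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi). \]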
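The cross product in this example can be verified numerically. The following Python sketch uses the two partial derivatives exactly as written above; the radius value, the sample point, and the function names are assumptions made for the illustration.

```python
import math

R = 2.0  # radius; an arbitrary value chosen for this sketch

def d_sigma_d_phi(phi, theta):
    return (R * math.cos(phi) * math.cos(theta),
            R * math.cos(phi) * math.sin(theta),
            -R * math.sin(phi))

def d_sigma_d_theta(phi, theta):
    return (-R * math.sin(phi) * math.sin(theta),
            R * math.sin(phi) * math.cos(theta),
            0.0)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def normal_unit_vector(phi, theta):
    n = cross(d_sigma_d_phi(phi, theta), d_sigma_d_theta(phi, theta))
    length = norm(n)
    return tuple(c / length for c in n)

if __name__ == "__main__":
    print(normal_unit_vector(0.7, 1.1))
```

At interior points $0 < \phi < \pi$ the cross product has norm $R^2\sin\phi > 0$, so the two partial derivatives are linearly independent and the normal unit vector is the radial direction.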
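Example 2 The surface $\sigma : [0, 2\pi] \times [0, 2\pi] \to \mathbb{R}^3$, defined by

\[ \sigma(u, v) = ((R + r\cos u)\cos v, (R + r\cos u)\sin v, r\sin u), \]

where $0 < r < R$, has as its image a torus

\[ \left\{ (x, y, z) : \left( \sqrt{x^2 + y^2} - R \right)^2 + z^2 = r^2 \right\}. \]

Even in this case, one can verify that it is a regular surface.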




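A 3-surface in $\mathbb{R}^3$ is also called a “volume.”

Example The function $\sigma : [0, R] \times [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$, defined by

\[ \sigma(\rho, \phi, \theta) = (\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi), \]

has as image the closed ball

\[ \{(x, y, z) : x^2 + y^2 + z^2 \leq R^2\}. \]

In this case, $\det J\sigma(\rho, \phi, \theta) = \rho^2\sin\phi$, so that it is a regular volume.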
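The value of the determinant can be confirmed by a direct computation. In the Python sketch below, the Jacobian matrix of $\sigma$ is written out by hand (rows are the components of $\sigma$, columns the variables $\rho, \phi, \theta$); the sample point and the function names are assumptions of the illustration.

```python
import math

def jacobian_sigma(rho, phi, theta):
    """Jacobian matrix of sigma(rho, phi, theta), derivatives computed by hand."""
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return [[sp * ct, rho * cp * ct, -rho * sp * st],
            [sp * st, rho * cp * st,  rho * sp * ct],
            [cp,      -rho * sp,      0.0]]

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

if __name__ == "__main__":
    rho, phi, theta = 1.5, 0.8, 2.0
    print(det3(jacobian_sigma(rho, phi, theta)), rho ** 2 * math.sin(phi))
```

In the interior of the rectangle, where $\rho > 0$ and $0 < \phi < \pi$, the determinant $\rho^2\sin\phi$ is positive, so the Jacobian matrix has rank 3 there.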


The best way to describe a set $\mathcal{M}$ in $\mathbb{R}^N$ is to find a parametrization. Let us explain precisely what this means.

Definition 10.28 An $M$-surface $\sigma : I \to \mathbb{R}^N$ is an “$M$-parametrization” of a set $\mathcal{M}$ if it is regular and injective on $\mathring{I}$ and $\sigma(I) = \mathcal{M}$. We say that a subset of $\mathbb{R}^N$ is “$M$-parametrizable” if there is an $M$-parametrization of it.

Examples The circle $\mathcal{M} = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ is 1-parametrizable, and $\sigma : [0, 2\pi] \to \mathbb{R}^2$, given by $\sigma(t) = (\cos t, \sin t)$, is a 1-parametrization of it.

A 2-parametrization of the sphere $\mathcal{M} = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$ is, for example, $\sigma : [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$, defined by

\[ \sigma(\phi, \theta) = (\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi). \]


10.13 Local Analysis of M-Surfaces 


Sometimes geometrical objects are given by an equation like $y = x^2$ (a parabola) or
$x^2 + y^2 = 1$ (a circle) or $x^2 + y^2 + z^2 = 1$ (a sphere). We will now show that, under
reasonable assumptions, these kinds of objects can be locally described by a curve,
a surface, or, in general, an M-surface, which, we recall, is a $C^1$ function defined on
a rectangle, with values in $\mathbb{R}^N$. We now assume $1 \le M < N$.
We thus have in mind a geometrical object described by an equation like

$$g(\boldsymbol{x}) = 0.$$


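We will focus our attention at a point $\boldsymbol{x}_0$ and describe locally our object by some
$C^1$-function defined on some rectangle of the type

$$B[r] = [-r, r] \times \cdots \times [-r, r].$$

Theorem 10.29 Let $O \subseteq \mathbb{R}^N$ be an open set, $\boldsymbol{x}_0$ a point of $O$, and $g : O \to \mathbb{R}^{N-M}$
a function of class $C^1$ such that

$$g(\boldsymbol{x}_0) = 0, \quad \text{and} \quad Jg(\boldsymbol{x}_0) \text{ has rank } N - M.$$

Then there exist a neighborhood $U$ of $\boldsymbol{x}_0$ and a regular and injective M-surface
$\sigma : B[r] \to \mathbb{R}^N$ for some $r > 0$ such that $\sigma(0) = \boldsymbol{x}_0$ and

$$\{\boldsymbol{x} \in U : g(\boldsymbol{x}) = 0\} = \sigma(B[r]).$$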


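Proof Assume, for instance, that the matrix

$$\frac{\partial g}{\partial \tilde{\boldsymbol{x}}}(\boldsymbol{x}_0) = \begin{pmatrix} \frac{\partial g_1}{\partial x_{M+1}}(\boldsymbol{x}_0) & \cdots & \frac{\partial g_1}{\partial x_N}(\boldsymbol{x}_0) \\ \vdots & & \vdots \\ \frac{\partial g_{N-M}}{\partial x_{M+1}}(\boldsymbol{x}_0) & \cdots & \frac{\partial g_{N-M}}{\partial x_N}(\boldsymbol{x}_0) \end{pmatrix}$$

is invertible. (If not, it will be sufficient to shift the columns of the matrix $Jg(\boldsymbol{x}_0)$
to reduce to this case.) Let us write each $\boldsymbol{x} \in O$ as $(\hat{\boldsymbol{x}}, \tilde{\boldsymbol{x}})$, where $\hat{\boldsymbol{x}} \in \mathbb{R}^M$ and
$\tilde{\boldsymbol{x}} \in \mathbb{R}^{N-M}$. Since

$$g(\hat{\boldsymbol{x}}_0, \tilde{\boldsymbol{x}}_0) = 0 \quad \text{and} \quad \det \frac{\partial g}{\partial \tilde{\boldsymbol{x}}}(\hat{\boldsymbol{x}}_0, \tilde{\boldsymbol{x}}_0) \ne 0,$$

by the Implicit Function Theorem 10.21, there exist an open neighborhood $\hat{U}$ of $\hat{\boldsymbol{x}}_0$,
an open neighborhood $\tilde{U}$ of $\tilde{\boldsymbol{x}}_0$, and a $C^1$ function $\eta : \hat{U} \to \tilde{U}$ such that $\hat{U} \times \tilde{U} \subseteq O$,
and, taking $\hat{\boldsymbol{x}} \in \hat{U}$ and $\tilde{\boldsymbol{x}} \in \tilde{U}$, we have that

$$g(\hat{\boldsymbol{x}}, \tilde{\boldsymbol{x}}) = 0 \iff \tilde{\boldsymbol{x}} = \eta(\hat{\boldsymbol{x}}).$$

Let $r > 0$ be chosen such that $B[\hat{\boldsymbol{x}}_0, r] \subseteq \hat{U}$, let $U = B[\hat{\boldsymbol{x}}_0, r] \times \tilde{U}$, and let
$\sigma : B[r] \to \mathbb{R}^N$ be defined as $\sigma(\boldsymbol{u}) = (\boldsymbol{u} + \hat{\boldsymbol{x}}_0, \eta(\boldsymbol{u} + \hat{\boldsymbol{x}}_0))$. Since $J\sigma(\boldsymbol{u})$ has as a
submatrix the identity $M \times M$ matrix, surely $\sigma$ is regular. Moreover, $\sigma$ is injective
since its first component $\boldsymbol{u} \mapsto \boldsymbol{u} + \hat{\boldsymbol{x}}_0$ is. Finally, if $\boldsymbol{x} = (\hat{\boldsymbol{x}}, \tilde{\boldsymbol{x}}) \in U$, then

$$g(\hat{\boldsymbol{x}}, \tilde{\boldsymbol{x}}) = 0 \iff \tilde{\boldsymbol{x}} = \eta(\hat{\boldsymbol{x}}) \iff (\hat{\boldsymbol{x}}, \tilde{\boldsymbol{x}}) = \sigma(\hat{\boldsymbol{x}} - \hat{\boldsymbol{x}}_0),$$

yielding the conclusion. □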
The M-surface $\sigma$ appearing in the statement of the previous theorem is called a
“local M-parametrization.”
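
As an illustration of the construction used in the proof, here is a minimal sketch in Python for the case M = 1, N = 2. The choice of the unit circle g(x, y) = x² + y² − 1, of the point (0, 1), of the radius r = 0.5, and all function names are our own assumptions. Near (0, 1) the implicit function is η(x) = √(1 − x²), and the local parametrization is σ(u) = (u + x₀, η(u + x₀)).

```python
import math

def g(x, y):
    """Constraint function: the unit circle is the set g = 0."""
    return x * x + y * y - 1.0

def eta(x):
    """Implicit function solving g(x, eta(x)) = 0 near the point (0, 1)."""
    return math.sqrt(1.0 - x * x)

x0, r = 0.0, 0.5   # base point abscissa and half-width of [-r, r] (assumed)

def sigma(u):
    """Local 1-parametrization sigma(u) = (u + x0, eta(u + x0))."""
    return (u + x0, eta(u + x0))

samples = [-r + k * (2 * r) / 10 for k in range(11)]
residuals = [abs(g(*sigma(u))) for u in samples]
first_components = [sigma(u)[0] for u in samples]
print(max(residuals))
```

The first component of σ is strictly increasing in u, which is exactly the reason given in the proof for the injectivity of σ.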
Let us analyze in greater detail three interesting cases. We start by considering a
planar curve, i.e., the case M = 1, N = 2.

Corollary 10.30 Let $O \subseteq \mathbb{R}^2$ be an open set, $(x_0, y_0)$ a point of $O$, and $g : O \to \mathbb{R}$
a function of class $C^1$ such that

$$g(x_0, y_0) = 0 \quad \text{and} \quad \nabla g(x_0, y_0) \ne 0.$$

Then there exist a neighborhood $U$ of $(x_0, y_0)$ and a regular and injective curve
$\sigma : [-r, r] \to \mathbb{R}^2$ for some $r > 0$ such that $\sigma(0) = (x_0, y_0)$ and

$$\{(x, y) \in U : g(x, y) = 0\} = \sigma([-r, r]).$$

Let us now examine the case of a surface in $\mathbb{R}^3$, i.e., the case M = 2, N = 3.

Corollary 10.31 Let $O \subseteq \mathbb{R}^3$ be an open set, $(x_0, y_0, z_0)$ a point of $O$, and
$g : O \to \mathbb{R}$ a function of class $C^1$ such that

$$g(x_0, y_0, z_0) = 0 \quad \text{and} \quad \nabla g(x_0, y_0, z_0) \ne 0.$$

Then there exist a neighborhood $U$ of $(x_0, y_0, z_0)$ and a regular and injective surface
$\sigma : [-r, r] \times [-r, r] \to \mathbb{R}^3$ for some $r > 0$ such that $\sigma(0, 0) = (x_0, y_0, z_0)$ and

$$\{(x, y, z) \in U : g(x, y, z) = 0\} = \sigma([-r, r] \times [-r, r]).$$

We conclude with the case of a curve in $\mathbb{R}^3$, i.e., the case M = 1, N = 3.

Corollary 10.32 Let $O \subseteq \mathbb{R}^3$ be an open set, $(x_0, y_0, z_0)$ a point of $O$, and
$g_1, g_2 : O \to \mathbb{R}$ two functions of class $C^1$ such that

$$g_1(x_0, y_0, z_0) = g_2(x_0, y_0, z_0) = 0 \quad \text{and} \quad \nabla g_1(x_0, y_0, z_0) \times \nabla g_2(x_0, y_0, z_0) \ne 0.$$

Then there exist a neighborhood $U$ of $(x_0, y_0, z_0)$ and a regular and injective curve
$\sigma : [-r, r] \to \mathbb{R}^3$ for some $r > 0$ such that $\sigma(0) = (x_0, y_0, z_0)$ and

$$\{(x, y, z) \in U : g_1(x, y, z) = g_2(x, y, z) = 0\} = \sigma([-r, r]).$$




10.14 Lagrange Multipliers 


We are now interested in finding local minimum or local maximum points for f 
when its domain is constrained to a set defined by some vector valued function g. 


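Theorem 10.33 (Lagrange Multiplier Theorem) Let $O \subseteq \mathbb{R}^N$ be an open set and
$\boldsymbol{x}_0$ a point of $O$. Let $g : O \to \mathbb{R}^{N-M}$ be a function of class $C^1$ such that

$$g(\boldsymbol{x}_0) = 0, \quad \text{and} \quad Jg(\boldsymbol{x}_0) \text{ has rank } N - M,$$

and let $f : O \to \mathbb{R}$ be differentiable at $\boldsymbol{x}_0$. Setting

$$S = \{\boldsymbol{x} \in O : g(\boldsymbol{x}) = 0\},$$

if $\boldsymbol{x}_0$ is either a local minimum or a local maximum point for $f|_S$ (the restriction of
$f$ to $S$), then there exist $(N - M)$ real numbers $\lambda_1, \dots, \lambda_{N-M}$ such that

$$\nabla f(\boldsymbol{x}_0) = \sum_{j=1}^{N-M} \lambda_j \nabla g_j(\boldsymbol{x}_0).$$

The numbers $\lambda_1, \dots, \lambda_{N-M}$ are called “Lagrange multipliers.”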


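Proof By Theorem 10.29, there exist a neighborhood $U$ of $\boldsymbol{x}_0$ and a regular and
injective M-surface $\sigma : B[r] \to \mathbb{R}^N$ for some $r > 0$ such that $\sigma(0) = \boldsymbol{x}_0$ and

$$S \cap U = \sigma(B[r]).$$

Consider the function $F : B[r] \to \mathbb{R}$ defined as $F(\boldsymbol{u}) = f(\sigma(\boldsymbol{u}))$. Then 0 is either
a local minimum or a local maximum point for $F$, hence $\nabla F(0) = 0$, i.e.,

$$0 = JF(0) = Jf(\boldsymbol{x}_0)\, J\sigma(0) = \nabla f(\boldsymbol{x}_0)^T J\sigma(0).$$

As a consequence,

$$\nabla f(\boldsymbol{x}_0) \cdot \frac{\partial\sigma}{\partial u_1}(0) = 0, \ \dots, \ \nabla f(\boldsymbol{x}_0) \cdot \frac{\partial\sigma}{\partial u_M}(0) = 0,$$

i.e., $\nabla f(\boldsymbol{x}_0)$ is orthogonal to $\frac{\partial\sigma}{\partial u_1}(0), \dots, \frac{\partial\sigma}{\partial u_M}(0)$.

Moreover, since $g(\sigma(\boldsymbol{u})) = 0$ for every $\boldsymbol{u} \in B[r]$, we have that

$$Jg(\boldsymbol{x}_0)\, J\sigma(0) = 0,$$

hence, the vectors $\nabla g_1(\boldsymbol{x}_0), \dots, \nabla g_{N-M}(\boldsymbol{x}_0)$ are all orthogonal to $\frac{\partial\sigma}{\partial u_1}(0), \dots, \frac{\partial\sigma}{\partial u_M}(0)$.

By assumption, $J\sigma(0)$ has rank $M$, i.e., the real vector space $T$ generated by
$\frac{\partial\sigma}{\partial u_1}(0), \dots, \frac{\partial\sigma}{\partial u_M}(0)$ has dimension $M$.

Therefore, the orthogonal space $T^\perp$ has dimension $N - M$. The vectors $\nabla g_1(\boldsymbol{x}_0)$,
$\dots$, $\nabla g_{N-M}(\boldsymbol{x}_0)$ are linearly independent and, as we saw earlier, they belong to
$T^\perp$, so these vectors form a basis for $T^\perp$. Since $\nabla f(\boldsymbol{x}_0)$ also belongs to $T^\perp$, it
must be a linear combination of the vectors of the basis. □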


As in the previous section, we analyze in detail three interesting cases. We start
by considering the case M = 1, N = 2.

Corollary 10.34 Let $O \subseteq \mathbb{R}^2$ be an open set and $(x_0, y_0)$ a point of $O$. Let
$g : O \to \mathbb{R}$ be a function of class $C^1$ such that

$$g(x_0, y_0) = 0, \quad \text{and} \quad \nabla g(x_0, y_0) \ne 0,$$

and let $f : O \to \mathbb{R}$ be differentiable at $(x_0, y_0)$. Setting

$$S = \{(x, y) \in O : g(x, y) = 0\},$$

if $(x_0, y_0)$ is either a local minimum or a local maximum point for $f|_S$, then there
exists a real number $\lambda$ such that

$$\nabla f(x_0, y_0) = \lambda \nabla g(x_0, y_0).$$
Example Among all rectangles in the plane with a given perimeter $p$, we want to
find those that maximize the area. Let us denote by $x$ and $y$ the lengths of the sides
of a rectangle and define the area function

$$f(x, y) = xy.$$

We are looking for the maximum points of the function $f$ over the set

$$K = \{(x, y) \in \mathbb{R}^2 : x \ge 0,\ y \ge 0,\ 2x + 2y = p\}.$$

This set is compact, so that $f$, being continuous, surely has a maximum point in $K$.
Taking $(x, y) \in K$, note that $f(x, y) = 0$ only when $x = 0$ or $y = 0$; otherwise,
$f(x, y) > 0$. Define now the function

$$g(x, y) = 2x + 2y - p.$$

Then

$$\nabla f(x, y) = \lambda \nabla g(x, y) \iff (y, x) = \lambda(2, 2).$$

By the preceding considerations and Corollary 10.34, the maximum point $(x_0, y_0)$
must be such that $x_0 = y_0$; hence, the rectangle must be a square.
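
The conclusion of this example can be tested by brute force. The sketch below is only an illustration: the perimeter value p = 12 and the grid resolution are arbitrary assumptions. It evaluates the area along the constraint 2x + 2y = p and locates the largest value.

```python
p = 12.0  # hypothetical perimeter

def area(x):
    """Area of the rectangle with side x and perimeter p."""
    y = p / 2 - x          # from the constraint 2x + 2y = p
    return x * y

n = 1200                   # number of grid steps on [0, p/2]
grid = [k * (p / 2) / n for k in range(n + 1)]
best_x = max(grid, key=area)
best_y = p / 2 - best_x
print(best_x, best_y, area(best_x))
```

On this grid the maximizer is the square of side p/4, with area p²/16, in agreement with the Lagrange condition (y, x) = λ(2, 2).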


Now we move to the case M = 2, N = 3.

Corollary 10.35 Let $O \subseteq \mathbb{R}^3$ be an open set and $(x_0, y_0, z_0)$ a point of $O$. Let
$g : O \to \mathbb{R}$ be a function of class $C^1$ such that

$$g(x_0, y_0, z_0) = 0, \quad \text{and} \quad \nabla g(x_0, y_0, z_0) \ne 0,$$

and let $f : O \to \mathbb{R}$ be differentiable at $(x_0, y_0, z_0)$. Setting

$$S = \{(x, y, z) \in O : g(x, y, z) = 0\},$$

if $(x_0, y_0, z_0)$ is either a local minimum or a local maximum point for $f|_S$, then
there exists a real number $\lambda$ such that

$$\nabla f(x_0, y_0, z_0) = \lambda \nabla g(x_0, y_0, z_0).$$
Example Among all cuboids with a given area $a$, we want to find those that
maximize the volume. Let us denote by $x$, $y$, and $z$ the lengths of the sides of a
cuboid, and define the volume function

$$f(x, y, z) = xyz.$$

We are looking for the maximum points of the function $f$ over the set

$$K = \{(x, y, z) \in \mathbb{R}^3 : x \ge 0,\ y \ge 0,\ z \ge 0,\ 2xy + 2xz + 2yz = a\}.$$

Taking $(x, y, z) \in K$, note that $f(x, y, z) = 0$ only when $x = 0$ or $y = 0$ or
$z = 0$; otherwise, $f(x, y, z) > 0$. Everything then seems as in the previous example,
but there is a difficulty now. The set $K$ is unbounded, hence not compact, and the
argument of the previous example needs to be modified. First of all we note that

$$\left(\sqrt{\tfrac{a}{6}}, \sqrt{\tfrac{a}{6}}, \sqrt{\tfrac{a}{6}}\right) \in K, \quad \text{and} \quad f\left(\sqrt{\tfrac{a}{6}}, \sqrt{\tfrac{a}{6}}, \sqrt{\tfrac{a}{6}}\right) = \left(\frac{a}{6}\right)^{3/2}.$$

Hence, if $(x_0, y_0, z_0)$ is a maximum point of $f$ on $K$, it must be that

$$f(x_0, y_0, z_0) \ge \left(\frac{a}{6}\right)^{3/2}.$$

We now prove that it must be that

$$x_0 \le 3\sqrt{\frac{3a}{2}}, \quad y_0 \le 3\sqrt{\frac{3a}{2}}, \quad z_0 \le 3\sqrt{\frac{3a}{2}}.$$

Let us prove the first one, the others being analogous. By contradiction, if
$x_0 > 3\sqrt{\frac{3a}{2}}$, then, since $2x_0y_0 \le a$ and $2x_0z_0 \le a$, it must be that

$$x_0y_0z_0 = \frac{(x_0y_0)(x_0z_0)}{x_0} < \frac{a^2}{4} \cdot \frac{1}{3}\sqrt{\frac{2}{3a}} = \left(\frac{a}{6}\right)^{3/2},$$

in contrast to $f(x_0, y_0, z_0) \ge \left(\frac{a}{6}\right)^{3/2}$.

We can then restrict the search of a maximum point of $f$ on the set

$$\widetilde{K} = \left\{(x, y, z) \in \mathbb{R}^3 : 0 \le x \le 3\sqrt{\tfrac{3a}{2}},\ 0 \le y \le 3\sqrt{\tfrac{3a}{2}},\ 0 \le z \le 3\sqrt{\tfrac{3a}{2}},\ 2xy + 2xz + 2yz = a\right\},$$

which is compact; hence, the point of maximum exists. Now define the function

$$g(x, y, z) = 2xy + 2xz + 2yz - a.$$

Then

$$\nabla f(x, y, z) = \lambda \nabla g(x, y, z) \iff (yz, xz, xy) = \lambda(2y + 2z,\ 2x + 2z,\ 2x + 2y).$$

If $(x, y, z)$ solves the preceding equation with $x > 0$, $y > 0$, and $z > 0$, then

$$x(y + z) = y(x + z) = z(x + y),$$

and hence $x = y = z$. By the foregoing considerations and Corollary 10.35, the
maximum point $(x_0, y_0, z_0)$ must be such that $x_0 = y_0 = z_0$, so the cuboid must be
a cube.
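
A quick numerical experiment supports this conclusion. In the sketch below, the value a = 6 (for which the cube has side 1), the sampling window, and the grid size are all assumptions made for illustration. We solve the constraint for z and compare the volumes found on a grid with the volume of the cube.

```python
import math

a = 6.0  # hypothetical surface area; the cube then has side 1

def third_side(x, y):
    """Solve 2xy + 2xz + 2yz = a for z (negative when xy > a/2)."""
    return (a - 2 * x * y) / (2 * x + 2 * y)

cube_side = math.sqrt(a / 6)
cube_volume = (a / 6) ** 1.5

largest = 0.0
steps = 200
for i in range(1, steps + 1):
    for j in range(1, steps + 1):
        x = 3.0 * i / steps
        y = 3.0 * j / steps
        z = third_side(x, y)
        if z > 0:                      # keep only genuine cuboids
            largest = max(largest, x * y * z)

print(largest, cube_volume)
```

No sampled cuboid exceeds the volume (a/6)^(3/2) of the cube, and the sampled volumes come close to it near x = y = √(a/6).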

Let us conclude with the case M = 1, N = 3.

Corollary 10.36 Let $O \subseteq \mathbb{R}^3$ be an open set and $(x_0, y_0, z_0)$ a point of $O$. Let
$g_1, g_2 : O \to \mathbb{R}$ be two functions of class $C^1$ such that

$$g_1(x_0, y_0, z_0) = g_2(x_0, y_0, z_0) = 0 \quad \text{and} \quad \nabla g_1(x_0, y_0, z_0) \times \nabla g_2(x_0, y_0, z_0) \ne 0,$$

and let $f : O \to \mathbb{R}$ be differentiable at $(x_0, y_0, z_0)$. Setting

$$S = \{(x, y, z) \in O : g_1(x, y, z) = 0,\ g_2(x, y, z) = 0\},$$

if $(x_0, y_0, z_0)$ is either a local minimum or a local maximum point for $f|_S$, then
there exist two real numbers $\lambda_1, \lambda_2$ such that

$$\nabla f(x_0, y_0, z_0) = \lambda_1 \nabla g_1(x_0, y_0, z_0) + \lambda_2 \nabla g_2(x_0, y_0, z_0).$$


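Example We want to find the minimum and maximum points of the function
$f(x, y, z) = z$ on the set

$$S = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1,\ x + y + z = 1\}.$$

Note that $S$ is compact, so the minimum and maximum of $f$ on $S$ exist. Define
$g_1(x, y, z) = x^2 + y^2 + z^2 - 1$ and $g_2(x, y, z) = x + y + z - 1$. Then

$$\nabla g_1(x, y, z) = (2x, 2y, 2z), \quad \nabla g_2(x, y, z) = (1, 1, 1),$$

hence

$$\nabla g_1(x, y, z) \times \nabla g_2(x, y, z) = (2(y - z),\ 2(z - x),\ 2(x - y)).$$

Note that

$$\nabla g_1(x, y, z) \times \nabla g_2(x, y, z) = 0 \iff x = y = z,$$

which implies that $(x, y, z) \notin S$. Now, a simple computation shows that if
$(x, y, z) \in S$, then we have that

$$\nabla f(x, y, z) = \lambda_1 \nabla g_1(x, y, z) + \lambda_2 \nabla g_2(x, y, z)$$

if and only if either

$$(x, y, z) = (0, 0, 1), \quad \lambda_1 = \tfrac{1}{2}, \quad \lambda_2 = 0$$

or

$$(x, y, z) = \left(\tfrac{2}{3}, \tfrac{2}{3}, -\tfrac{1}{3}\right), \quad \lambda_1 = -\tfrac{1}{2}, \quad \lambda_2 = \tfrac{2}{3}.$$

Since $f(0, 0, 1) = 1$ and $f\left(\tfrac{2}{3}, \tfrac{2}{3}, -\tfrac{1}{3}\right) = -\tfrac{1}{3}$, by the preceding considerations and
Corollary 10.36, we conclude that $(0, 0, 1)$ is a maximum point and $\left(\tfrac{2}{3}, \tfrac{2}{3}, -\tfrac{1}{3}\right)$ is a
minimum point of $f$ on $S$.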
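
The “simple computation” can be double-checked in exact arithmetic. The following sketch uses Python's `fractions` module; the function names and the layout of `candidates` are our own. It verifies that the points (0, 0, 1) and (2/3, 2/3, −1/3) lie on S and that ∇f = λ₁∇g₁ + λ₂∇g₂ holds there with (λ₁, λ₂) = (1/2, 0) and (−1/2, 2/3), respectively.

```python
from fractions import Fraction as F

def grad_g1(x, y, z):
    """Gradient of g1(x, y, z) = x^2 + y^2 + z^2 - 1."""
    return (2 * x, 2 * y, 2 * z)

def grad_g2(x, y, z):
    """Gradient of g2(x, y, z) = x + y + z - 1."""
    return (F(1), F(1), F(1))

grad_f = (F(0), F(0), F(1))   # gradient of f(x, y, z) = z

# Each candidate: (point, lambda_1, lambda_2).
candidates = [
    ((F(0), F(0), F(1)), F(1, 2), F(0)),
    ((F(2, 3), F(2, 3), F(-1, 3)), F(-1, 2), F(2, 3)),
]

def check(point, lam1, lam2):
    """True if the point is on S and satisfies the multiplier equation."""
    x, y, z = point
    on_sphere = x * x + y * y + z * z == 1
    on_plane = x + y + z == 1
    combo = tuple(lam1 * u + lam2 * v
                  for u, v in zip(grad_g1(x, y, z), grad_g2(x, y, z)))
    return on_sphere and on_plane and combo == grad_f

results = [check(*c) for c in candidates]
print(results)
```

Working with fractions avoids rounding, so the equalities are tested exactly.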


10.15 Differentiable Manifolds 


There is an alternative way of looking at some geometrical objects such as “curves” 
and “surfaces.” The intuitive idea is that they locally “look the same” as a straight 
line or a plane. In other words, when observing these objects from a very small 
distance, they look “almost flat.” 

We will now make this idea precise, in a general finite-dimensional context. Thus,
let $\mathcal{M}$ be a subset of $\mathbb{R}^N$.

Definition 10.37 The set $\mathcal{M}$ is an “M-dimensional differentiable manifold,” with
$1 \le M \le N$ (or an “M-manifold” for short) if, taking a point $\boldsymbol{x}$ in $\mathcal{M}$, there
are an open neighborhood $A$ of $\boldsymbol{x}$, an open neighborhood $B$ of $0$ in $\mathbb{R}^N$, and a
diffeomorphism $\varphi : A \to B$ such that $\varphi(\boldsymbol{x}) = 0$ and either

(a) $\varphi(A \cap \mathcal{M}) = \{\boldsymbol{y} = (y_1, \dots, y_N) \in B : y_{M+1} = \cdots = y_N = 0\}$
or
(b) $\varphi(A \cap \mathcal{M}) = \{\boldsymbol{y} = (y_1, \dots, y_N) \in B : y_{M+1} = \cdots = y_N = 0 \text{ and } y_M \ge 0\}$.

It can be seen that (a) and (b) cannot hold at the same time. The points $\boldsymbol{x}$ for
which (b) is verified make up the “boundary” of $\mathcal{M}$, which we denote by $\partial\mathcal{M}$. If
$\partial\mathcal{M}$ is empty, we are speaking of an M-manifold without boundary; otherwise, $\mathcal{M}$
is sometimes said to be an M-manifold with boundary.

First, note that the boundary of a differentiable manifold is itself a differentiable
manifold, with a lower dimension.


Theorem 10.38 The set $\partial\mathcal{M}$ is an $(M - 1)$-manifold without boundary, i.e.,
$\partial(\partial\mathcal{M}) = \emptyset$.

Proof Taking a point $\boldsymbol{x}$ in $\partial\mathcal{M}$, there are an open neighborhood $A$ of $\boldsymbol{x}$, an open
neighborhood $B$ of $0$ in $\mathbb{R}^N$, and a diffeomorphism $\varphi : A \to B$ such that $\varphi(\boldsymbol{x}) = 0$
and

$$\varphi(A \cap \mathcal{M}) = \{\boldsymbol{y} = (y_1, \dots, y_N) \in B : y_{M+1} = \cdots = y_N = 0 \text{ and } y_M \ge 0\}.$$

Based on the fact that the conditions (a) and (b) of the definition cannot hold
simultaneously for any point of $\mathcal{M}$, it is possible to prove that

$$\varphi(A \cap \partial\mathcal{M}) = \{\boldsymbol{y} = (y_1, \dots, y_N) \in B : y_M = y_{M+1} = \cdots = y_N = 0\}.$$

This completes the proof. □


There are many examples of manifolds: Circles, spheres, and toruses are
manifolds without boundary. A hemisphere is a 2-manifold whose boundary is a
circle. However, a cone is not a manifold because of a single point, its vertex. Notice
that any open set in $\mathbb{R}^N$ is an N-manifold (without boundary).

Let us now see that, given an M-manifold $\mathcal{M}$, corresponding to each of its points
$\boldsymbol{x}$ it is possible to find a local M-parametrization.


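Theorem 10.39 For every $\boldsymbol{x} \in \mathcal{M}$ there is a neighborhood $A'$ of $\boldsymbol{x}$ such that
$A' \cap \mathcal{M}$ can be M-parametrized with an injective function $\sigma : I \to \mathbb{R}^N$, where $I$ is
a rectangle of $\mathbb{R}^M$ of the type

$$I = \begin{cases} [-\alpha, \alpha]^M & \text{if } \boldsymbol{x} \notin \partial\mathcal{M}, \\ [-\alpha, \alpha]^{M-1} \times [0, \alpha] & \text{if } \boldsymbol{x} \in \partial\mathcal{M}, \end{cases}$$

and $\sigma(0) = \boldsymbol{x}$. Moreover, if $\boldsymbol{x}$ is a point of the boundary $\partial\mathcal{M}$, the
M-parametrization $\sigma$ is such that the interior points of a single face of rectangle $I$
are sent on $\partial\mathcal{M}$.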


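Proof Consider the diffeomorphism $\varphi : A \to B$ given by the preceding definition,
and take an $\alpha > 0$ such that the rectangle $B' = [-\alpha, \alpha]^N$ is contained in $B$. Setting
$A' = \varphi^{-1}(B')$, we have that $A'$ is a neighborhood of $\boldsymbol{x}$ (indeed, the set
$B'' = \,]-\alpha, \alpha[^N$ is open and, hence, also $A'' = \varphi^{-1}(B'')$ is open, and $\boldsymbol{x} \in A'' \subseteq A'$).
We can then take rectangle $I$ as in the statement and define $\sigma(\boldsymbol{u}) = \varphi^{-1}(\boldsymbol{u}, 0)$. It is
readily seen that $\sigma$ is injective and $\sigma(I) = A' \cap \mathcal{M}$. Moreover, $\varphi_{(1,\dots,M)}(\sigma(\boldsymbol{u})) = \boldsymbol{u}$
for every $\boldsymbol{u} \in I$; hence, $J\varphi_{(1,\dots,M)}(\sigma(\boldsymbol{u})) \cdot J\sigma(\boldsymbol{u})$ is the identity matrix, so that
$J\sigma(\boldsymbol{u})$ has rank $M$ for every $\boldsymbol{u} \in I$.
Finally, if $\boldsymbol{x} \in \partial\mathcal{M}$, then

$$[-\alpha, \alpha]^{M-1} \times \{0\} = \sigma^{-1}(A' \cap \partial\mathcal{M}),$$

thereby completing the proof. □

Notice that the function $\sigma$ is indeed defined on an open set containing $I$, and it is
injective there.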




In this chapter we extend the theory of the integral to functions of several variables
defined on subsets of $\mathbb{R}^N$, with values in $\mathbb{R}$. For simplicity, in the exposition we will
first focus our attention on the case N = 2 and later provide all the results in the
case of a generic dimension N.


11.1 Integrability on Rectangles 


We begin by considering the case of functions defined on rectangles. We recall that
a “rectangle” of $\mathbb{R}^N$ is a set of the type $[a_1, b_1] \times \cdots \times [a_N, b_N]$. In the following
exposition, we concentrate for simplicity on the two-dimensional case. The general
case is largely identical and does not involve greater difficulties, except for the
notations.

We consider the rectangle $I = [a_1, b_1] \times [a_2, b_2] \subseteq \mathbb{R}^2$ and define its measure

$$\mu(I) = (b_1 - a_1)(b_2 - a_2).$$

As a particular case, given $\boldsymbol{x} = (x, y) \in I$ and $r > 0$, we have

$$B[\boldsymbol{x}, r] = [x - r, x + r] \times [y - r, y + r];$$

it is the square centered at $\boldsymbol{x}$ having $r$ as half of the length of its sides. We say that
two rectangles are “nonoverlapping” if their interiors are disjoint.
A “tagged partition” of the rectangle $I$ is a set

$$P = \{(\boldsymbol{x}_1, I_1), (\boldsymbol{x}_2, I_2), \dots, (\boldsymbol{x}_m, I_m)\},$$

where the $I_j$ are nonoverlapping rectangles whose union is $I$ and, for every
$j = 1, \dots, m$, the point $\boldsymbol{x}_j = (x_j, y_j)$ belongs to $I_j$.




Example If $I = [0, 10] \times [0, 6]$, a possible tagged partition is the following:

$$P = \{((1, 1), [0, 7] \times [0, 2]),\ ((0, 5), [0, 3] \times [2, 6]),\ ((5, 4), [3, 10] \times [4, 6]),\ ((10, 0), [7, 10] \times [0, 4]),\ ((5, 3), [3, 7] \times [2, 4])\}.$$

Let us now consider a function $f$ defined on the rectangle $I$, with values in $\mathbb{R}$,
and let $P = \{(\boldsymbol{x}_j, I_j) : j = 1, \dots, m\}$ be a tagged partition of $I$. We call “Riemann
sum” associated with $f$ and $P$ the real number $S(f, P)$ defined by

$$S(f, P) = \sum_{j=1}^{m} f(\boldsymbol{x}_j)\, \mu(I_j).$$

Whenever $f$ happens to be positive, this number is the sum of the volumes of the
parallelepipeds having as base $I_j$ and height $[0, f(\boldsymbol{x}_j)]$.
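
To make the definition concrete, here is a minimal sketch in Python that computes a Riemann sum for the tagged partition of $[0, 10] \times [0, 6]$ given in the example above. The integrand f(x, y) = x + y and all function names are our own assumptions, chosen only for illustration.

```python
# Each entry is (tag point, (a1, b1), (a2, b2)) for the rectangle [a1, b1] x [a2, b2].
partition = [
    ((1, 1), (0, 7), (0, 2)),
    ((0, 5), (0, 3), (2, 6)),
    ((5, 4), (3, 10), (4, 6)),
    ((10, 0), (7, 10), (0, 4)),
    ((5, 3), (3, 7), (2, 4)),
]

def measure(side1, side2):
    """Measure (area) of the rectangle side1 x side2."""
    (a1, b1), (a2, b2) = side1, side2
    return (b1 - a1) * (b2 - a2)

def tag_in_rectangle(tag, side1, side2):
    """True if the tag point belongs to its rectangle."""
    (x, y), (a1, b1), (a2, b2) = tag, side1, side2
    return a1 <= x <= b1 and a2 <= y <= b2

def riemann_sum(f, tagged_partition):
    """S(f, P) = sum over j of f(x_j) * mu(I_j)."""
    return sum(f(*tag) * measure(s1, s2) for tag, s1, s2 in tagged_partition)

def f(x, y):
    """Hypothetical integrand."""
    return x + y

total_measure = sum(measure(s1, s2) for _, s1, s2 in partition)
tags_ok = all(tag_in_rectangle(tag, s1, s2) for tag, s1, s2 in partition)
value = riemann_sum(f, partition)
print(total_measure, tags_ok, value)
```

The measures of the five rectangles add up to 60, the measure of the whole rectangle, as they must for a partition into nonoverlapping rectangles.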




We call a “gauge” on $I$ every positive function $\delta : I \to \mathbb{R}$. Given a gauge $\delta$ on
$I$, we say that the tagged partition $P$ introduced previously is “$\delta$-fine” if, for every
$j = 1, \dots, m$,

$$I_j \subseteq B[\boldsymbol{x}_j, \delta(\boldsymbol{x}_j)].$$


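Example Let $I = [0, 1] \times [0, 1]$ and $\delta$ be the gauge defined as follows:

$$\delta(x, y) = \begin{cases} \dfrac{x + y}{3} & \text{if } (x, y) \ne (0, 0), \\[2mm] \dfrac{1}{2} & \text{if } (x, y) = (0, 0). \end{cases}$$

We want to find a $\delta$-fine tagged partition of $I$. Much like what we saw at the end of
Sect. 7.2, in this case one of the points $\boldsymbol{x}_j$ necessarily must coincide with $(0, 0)$. We
can then choose, for example,

$$P = \left\{ \left((0, 0), \left[0, \tfrac{1}{2}\right] \times \left[0, \tfrac{1}{2}\right]\right),\ \left(\left(\tfrac{1}{2}, 1\right), \left[0, \tfrac{1}{2}\right] \times \left[\tfrac{1}{2}, 1\right]\right),\ \left(\left(1, \tfrac{1}{2}\right), \left[\tfrac{1}{2}, 1\right] \times \left[0, \tfrac{1}{2}\right]\right),\ \left((1, 1), \left[\tfrac{1}{2}, 1\right] \times \left[\tfrac{1}{2}, 1\right]\right) \right\}.$$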
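
Whether a given tagged partition is δ-fine can be tested mechanically. The sketch below is an illustration under stated assumptions: the gauge is taken to be (x + y)/3 away from the origin and 1/2 at the origin, the partition consists of the four quarter squares of [0, 1] × [0, 1], and all names are ours. Exact fractions are used so that the boundary cases are decided without rounding.

```python
from fractions import Fraction as F

def gauge(x, y):
    """Assumed gauge: (x + y)/3 away from the origin, 1/2 at the origin."""
    return F(1, 2) if (x, y) == (0, 0) else F(x + y) / 3

def is_delta_fine(tagged_partition, delta):
    """True if every rectangle I_j is contained in the square B[x_j, delta(x_j)]."""
    for (x, y), (a1, b1), (a2, b2) in tagged_partition:
        d = delta(x, y)
        if not (x - d <= a1 and b1 <= x + d and y - d <= a2 and b2 <= y + d):
            return False
    return True

half = F(1, 2)
# Each entry is (tag point, (a1, b1), (a2, b2)).
fine = [
    ((F(0), F(0)), (F(0), half), (F(0), half)),
    ((half, F(1)), (F(0), half), (half, F(1))),
    ((F(1), half), (half, F(1)), (F(0), half)),
    ((F(1), F(1)), (half, F(1)), (half, F(1))),
]
# Same rectangles, but the first one is tagged at its corner (1/2, 1/2).
not_fine = [((half, half), (F(0), half), (F(0), half))] + fine[1:]

print(is_delta_fine(fine, gauge), is_delta_fine(not_fine, gauge))
```

The second partition fails because the gauge becomes small near the origin: the rectangle containing (0, 0) fits inside the prescribed square only when its tag is (0, 0) itself.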


It is interesting to observe that it is not always possible to construct $\delta$-fine tagged
partitions by only joining points on the edges of $I$. The reader may become
convinced when attempting to do this using the following gauge:

$$\delta(x, y) = \begin{cases} \dfrac{x + y}{16} & \text{if } (x, y) \ne (0, 0), \\[2mm] \dfrac{1}{2} & \text{if } (x, y) = (0, 0). \end{cases}$$


As in the one-dimensional case, one can prove that for every gauge $\delta$ on $I$ there
exists a $\delta$-fine tagged partition of $I$ (see Cousin’s Theorem 7.1). The following
definition is identical to the one given in Chap. 7.

Definition 11.1 A function $f : I \to \mathbb{R}$ is said to be “integrable” (on the rectangle
$I$) if there is a real number $\mathcal{J}$ with the following property. Given $\varepsilon > 0$, it is possible
to find a gauge $\delta$ on $I$ such that, for every $\delta$-fine tagged partition $P$ of $I$,

$$|S(f, P) - \mathcal{J}| \le \varepsilon.$$


We briefly overview all the properties that can be obtained from the given
definition in the same way as was done in the case of a function of a single variable.
First of all, there is at most one $\mathcal{J} \in \mathbb{R}$ that verifies the conditions of the
definition. Such a number is called the “integral” of $f$ on $I$ and is denoted by one
of the following symbols:

$$\int_I f, \qquad \int_I f(\boldsymbol{x})\, d\boldsymbol{x}, \qquad \int_I f(x, y)\, dx\, dy.$$

The set of integrable functions is a real vector space, and the integral is a linear
function on it:

$$\int_I (f + g) = \int_I f + \int_I g, \qquad \int_I (\alpha f) = \alpha \int_I f$$

(with $\alpha \in \mathbb{R}$). It preserves the order:

$$f \le g \implies \int_I f \le \int_I g.$$


The Cauchy criterion of integrability holds.

Theorem 11.2 (Cauchy Criterion) A function $f : I \to \mathbb{R}$ is integrable if and only
if for every $\varepsilon > 0$ there is a gauge $\delta : I \to \mathbb{R}$ such that, taking two $\delta$-fine tagged
partitions $P$, $Q$ of $I$, we have

$$|S(f, P) - S(f, Q)| \le \varepsilon.$$

Moreover, we have the following property of “additivity on subrectangles.”

Theorem 11.3 Let $f : I \to \mathbb{R}$ be a function and $K_1, K_2, \dots, K_l$ be nonoverlapping
subrectangles of $I$ whose union is $I$. Then $f$ is integrable on $I$ if and only if it
is integrable on each of the $K_i$. In that case, we have

$$\int_I f = \sum_{i=1}^{l} \int_{K_i} f.$$


In particular, if a function is integrable on a rectangle, it is still integrable on every
subrectangle. The proof of the theorem is similar to that of Theorem 7.18 and is
based on the possibility of constructing a gauge that would allow us to split the
Riemann sums on $I$ into Riemann sums on the single subrectangles.

We say that an integrable function on $I$ is “R-integrable” there (or integrable
according to Riemann) if, among all possible gauges $\delta$ which verify the definition
of integrability, it is always possible to choose one that is constant on $I$. The set of
R-integrable functions is a vector subspace of the space of integrable functions and
contains the subspace of continuous functions.

We say that an integrable function $f : I \to \mathbb{R}$ is “L-integrable” (or integrable
according to Lebesgue) if $|f|$ is integrable on $I$ as well. The L-integrable functions
make up a vector subspace of the space of integrable functions. If $f$ and $g$ are
two L-integrable functions on $I$, then the functions $\min\{f, g\}$ and $\max\{f, g\}$ are
L-integrable on $I$, too. A function $f$ is L-integrable if and only if both its positive
part $f^+ = \max\{f, 0\}$ and its negative part $f^- = \max\{-f, 0\}$ are integrable.

The Saks–Henstock Theorem 9.1, Monotone Convergence Theorem 9.10, and
Dominated Convergence Theorem 9.13 extend to the integrable functions on a
rectangle, with statements and proofs perfectly analogous to those provided in
Chap. 9.


11.2  Integrability on a Bounded Set 


We will now provide the definition of the integral on an arbitrary bounded domain.
Given a bounded set $E$ and a function $f : E \to \mathbb{R}$, we define the function
$f_E : \mathbb{R}^N \to \mathbb{R}$ as follows:

$$f_E(\boldsymbol{x}) = \begin{cases} f(\boldsymbol{x}) & \text{if } \boldsymbol{x} \in E, \\ 0 & \text{if } \boldsymbol{x} \notin E. \end{cases}$$

We are thus led to the following definition.

Definition 11.4 Given a bounded set $E$, we say that the function $f : E \to \mathbb{R}$ is
“integrable” (on $E$) if there is a rectangle $I$ containing the set $E$ on which $f_E$ is
integrable. In that case, we set

$$\int_E f = \int_I f_E.$$

To verify the consistency of the preceding definition, we will now show that when
$f$ is integrable on $E$, we have that $f_E$ is integrable on any rectangle containing the
set $E$, and the integral of $f_E$ remains the same on each such rectangle.

Proposition 11.5 Let $I$ and $J$ be two rectangles containing the set $E$. Then $f_E$ is
integrable on $I$ if and only if it is integrable on $J$. In that case, we have
$\int_I f_E = \int_J f_E$.


Proof We consider for simplicity the case N = 2. Assume that $f_E$ is integrable
on $I$. Let $K$ be a rectangle containing both $I$ and $J$. We can construct some
nonoverlapping rectangles $K_1, \dots, K_r$, also nonoverlapping with $I$, such that
$I \cup K_1 \cup \cdots \cup K_r = K$. We now prove that $f_E$ is integrable on each of the subrectangles
$K_1, \dots, K_r$ and that the integrals $\int_{K_1} f_E, \dots, \int_{K_r} f_E$ are all equal to zero. Notice
that $f_E$, restricted to each of these subrectangles, is zero everywhere except perhaps
on one of their edges. We are thus led to prove the following lemma, which will
permit us to conclude the proof.


Lemma 11.6 Let $K$ be a rectangle and $g : K \to \mathbb{R}$ be a function which is zero
everywhere except perhaps on one edge of $K$. Then $g$ is integrable on $K$ and
$\int_K g = 0$.

Proof We first assume that the function $g$ is bounded on $K$, i.e., that there is a
constant $C > 0$ for which

$$|g(x, y)| \le C$$

for every $(x, y) \in K$. Fix $\varepsilon > 0$. Let $L$ be the edge of rectangle $K$ on which $g$ may
be nonzero, and denote by $\ell$ its length. Define the constant gauge $\delta = \frac{\varepsilon}{C\ell}$. Then, for
every $\delta$-fine tagged partition $P = \{(\boldsymbol{x}_1, I_1), \dots, (\boldsymbol{x}_m, I_m)\}$ of $K$, we have

$$|S(g, P)| \le \sum_{j=1}^{m} |g(\boldsymbol{x}_j)|\, \mu(I_j) = \sum_{\{j \,:\, \boldsymbol{x}_j \in L\}} |g(\boldsymbol{x}_j)|\, \mu(I_j) \le \sum_{\{j \,:\, \boldsymbol{x}_j \in L\}} C \mu(I_j) \le C \delta \ell = \varepsilon.$$

This proves that $g$ is integrable on $K$ and $\int_K g = 0$ in the case where $g$ is bounded
on $K$. If $g$ is not bounded on $K$, assume that it has nonnegative values. Define the
following sequence $(g_n)_n$ of functions as

$$g_n(\boldsymbol{x}) = \min\{g(\boldsymbol{x}), n\}.$$

Since the functions $g_n$ are bounded, for what we saw earlier we have $\int_K g_n = 0$,
for every $n$. It is easily seen that the sequence thus defined satisfies the conditions
of the Monotone Convergence Theorem 9.10 and converges pointwise to $g$. It then
follows that $g$ is integrable on $K$ and

$$\int_K g = \lim_n \int_K g_n = 0.$$

If $g$ does not have only nonnegative values, it is always possible to consider $g^+$
and $g^-$. From the preceding discussion, $\int_K g^+ = \int_K g^- = 0$, and then
$\int_K g = \int_K g^+ - \int_K g^- = 0$, which is what we wanted to prove. □


End of Proof of Proposition 11.5 Having proved that $f_E$ is integrable on each of
the $K_1, \dots, K_r$ and that the integrals $\int_{K_1} f_E, \dots, \int_{K_r} f_E$ are equal to zero, by the
theorem of additivity on subrectangles we have that, since $f_E$ is integrable on $I$, it
is also integrable on $K$, and

$$\int_K f_E = \int_I f_E + \int_{K_1} f_E + \cdots + \int_{K_r} f_E = \int_I f_E.$$

But then $f_E$ is integrable on every subrectangle of $K$, and in particular on $J$. We
can now construct, analogously to what was just done for $I$, some nonoverlapping
rectangles $J_1, \dots, J_s$, also nonoverlapping with $J$, such that $J \cup J_1 \cup \cdots \cup J_s = K$.
Similarly, we will have

$$\int_K f_E = \int_J f_E + \int_{J_1} f_E + \cdots + \int_{J_s} f_E = \int_J f_E,$$

which proves that $\int_I f_E = \int_J f_E$. To see that the condition is necessary and
sufficient, simply exchange the roles of $I$ and $J$ in the foregoing proof.

With the given definition, all the properties of the integral seen earlier easily 
extend to this setting. There is an exception concerning the additivity since it is not 
true in general that a function that is integrable on a bounded set remains integrable 
on any of its subsets. Indeed, take a function $f : E \to \mathbb{R}$, which is integrable but
not L-integrable. We consider the subset

$$E' = \{\boldsymbol{x} \in E : f(\boldsymbol{x}) \ge 0\},$$

and we claim that $f$ cannot be integrable on $E'$. If it were, then $f^+$ would be
integrable on $E$. But then $f^- = f^+ - f$ would also be integrable on $E$, and
therefore $f$ should be L-integrable on $E$, in contradiction with the assumption.

We will see that, with respect to additivity, the L-integrable functions have a 
somewhat better behavior. 


11.3 The Measure 
We now give the definition of “measure” for a bounded subset of $\mathbb{R}^N$.

Definition 11.7 A bounded set $E$ is said to be “measurable” if the constant
function 1 is integrable on $E$. In that case, the number $\int_E 1$ is said to be the
“measure” of $E$ and is denoted by $\mu(E)$.

The measure of a measurable set is thus a nonnegative number. The empty set is
assumed to be measurable, and its measure is equal to 0. In the case of a subset of
$\mathbb{R}^2$, its measure is also called the “area” of the set. If $E = [a_1, b_1] \times [a_2, b_2]$ is a
rectangle, it is easily seen that

$$\mu(E) = \int_E 1 = (b_1 - a_1)(b_2 - a_2),$$

so that the notation is in accordance with the one already introduced for rectangles.
For a subset of $\mathbb{R}^3$, the measure is also called the “volume” of the set.


Let us analyze some properties of the measure. It is useful to introduce the
characteristic function of a set $E$, defined by

$$\chi_E(\boldsymbol{x}) = \begin{cases} 1 & \text{if } \boldsymbol{x} \in E, \\ 0 & \text{if } \boldsymbol{x} \notin E. \end{cases}$$

If $I$ is a rectangle containing the set $E$, we thus have

$$\mu(E) = \int_I \chi_E.$$


Proposition 11.8 Let $A$ and $B$ be two measurable bounded sets. The following
properties hold:

(a) If $A \subseteq B$, then $B \setminus A$ is measurable, and

$$\mu(B \setminus A) = \mu(B) - \mu(A);$$

in particular, $\mu(A) \le \mu(B)$.
(b) $A \cup B$ and $A \cap B$ are measurable, and

$$\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B);$$

in particular, if $A$ and $B$ are disjoint, then $\mu(A \cup B) = \mu(A) + \mu(B)$.

Proof Let $I$ be a rectangle containing $A \cup B$. If $A \subseteq B$, then $\chi_{B \setminus A} = \chi_B - \chi_A$,
and property (a) follows integrating on $I$.

Since $\chi_{A \cup B} = \max\{\chi_A, \chi_B\}$ and $\chi_{A \cap B} = \min\{\chi_A, \chi_B\}$, we have that $\chi_{A \cup B}$ and
$\chi_{A \cap B}$ are integrable on $I$. Moreover,

$$\chi_{A \cup B} + \chi_{A \cap B} = \chi_A + \chi_B,$$

and integrating on $I$, we have (b). □


The following proposition states the Complete Additivity property of the
measure.

Proposition 11.9 If $(A_k)_{k \ge 1}$ is a sequence of measurable bounded sets whose
union $A = \bigcup_{k \ge 1} A_k$ is bounded, then $A$ is measurable, and

$$\mu(A) \le \sum_{k=1}^{\infty} \mu(A_k).$$

If the sets $A_k$ are pairwise disjoint, then equality holds.

Proof Assume first that the sets $A_k$ are pairwise disjoint. Let $I$ be a rectangle
containing their union $A$. Then, for every $\boldsymbol{x} \in I$,

$$\chi_A(\boldsymbol{x}) = \sum_{k=1}^{\infty} \chi_{A_k}(\boldsymbol{x}).$$

Moreover, since for every positive integer $q$ we have

$$\sum_{k=1}^{q} \mu(A_k) = \mu\left(\bigcup_{k=1}^{q} A_k\right) \le \mu(I),$$

the series $\sum_{k=1}^{\infty} \int_I \chi_{A_k} = \sum_{k=1}^{\infty} \mu(A_k)$ converges. By Corollary 9.11, we have that
$A$ is measurable and

$$\mu(A) = \int_I \chi_A = \int_I \sum_{k=1}^{\infty} \chi_{A_k} = \sum_{k=1}^{\infty} \int_I \chi_{A_k} = \sum_{k=1}^{\infty} \mu(A_k).$$

When the sets $A_k$ are not pairwise disjoint, consider the sets $B_1 = A_1$, $B_2 = A_2 \setminus A_1$
and, in general, $B_k = A_k \setminus (A_1 \cup \cdots \cup A_{k-1})$. The sets $B_k$ are measurable and
pairwise disjoint, and $\bigcup_{k \ge 1} B_k = \bigcup_{k \ge 1} A_k$. The conclusion then follows from what
was proved earlier, since $\mu(B_k) \le \mu(A_k)$ for every $k$. □


We have a similar proposition concerning the intersection of a countable family 
of sets. 


Proposition 11.10 If $(A_k)_{k \ge 1}$ is a sequence of measurable bounded sets, their
intersection $A = \bigcap_{k \ge 1} A_k$ is a measurable set.

Proof Let $I$ be a rectangle containing the set $A$. Then, since $A = \bigcap_{k \ge 1} (A_k \cap I)$,
we have that

$$\bigcap_{k \ge 1} A_k = I \setminus \left( \bigcup_{k \ge 1} \bigl(I \setminus (A_k \cap I)\bigr) \right),$$

and the conclusion follows from the two previous propositions. □


The following two propositions will provide us with a large class of measurable 
sets. 


Proposition 11.11 Every open and bounded set is measurable. 


Proof Consider for simplicity the case N = 2. Let $A$ be an open set contained in a
rectangle $I$. We divide the rectangle $I$ into four rectangles of equal areas using the
axes of its edges. Then we proceed analogously with each of these four rectangles,
thereby obtaining 16 smaller rectangles, and so on. Since $A$ is open, for every
$\boldsymbol{x} \in A$ there is a small rectangle among those just constructed that contains $\boldsymbol{x}$ and is
contained in $A$. In this way, it is seen that set $A$ is covered by a countable family of
rectangles; being the union of a countable family of measurable sets, it is therefore
measurable. □
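
The subdivision argument can be watched at work numerically. The following sketch is an illustration with our own choices: the open set A is the open unit disk, the rectangle is I = [−1, 1] × [−1, 1], and the function name is hypothetical. At each level it adds up the areas of the subrectangles that are contained in A; these totals increase with the level and stay below the area π of the disk.

```python
import math

def inner_area(level):
    """Total area of the level-`level` subsquares of [-1, 1]^2 contained in the open unit disk."""
    n = 2 ** level                  # number of subsquares per side
    side = 2.0 / n
    count = 0
    for i in range(n):
        for j in range(n):
            x_lo = -1.0 + i * side
            y_lo = -1.0 + j * side
            # Coordinates of the corner farthest from the origin.
            fx = max(abs(x_lo), abs(x_lo + side))
            fy = max(abs(y_lo), abs(y_lo + side))
            if fx * fx + fy * fy < 1.0:   # closed square inside the open disk
                count += 1
    return count * side * side

areas = [inner_area(level) for level in range(1, 8)]
print(areas, math.pi)
```

A subsquare contained in A has all four of its children contained in A as well, which is why the totals cannot decrease from one level to the next.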


Proposition 11.12 Every compact set is measurable.

Proof Let $B$ be a compact set, and let $I$ be a rectangle whose interior $\mathring{I}$ contains $B$.
Since $\mathring{I}$ and $\mathring{I} \setminus B$ are open and, hence, measurable, we have that $B = \mathring{I} \setminus (\mathring{I} \setminus B)$ is
measurable. □

Example The set

$$E = \{(x, y) \in \mathbb{R}^2 : 1 < x^2 + y^2 \le 4\}$$

is measurable, since it is the difference of the closed disks with radius 2 and 1
centered at the origin, i.e.,

$$E = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \le 4\} \setminus \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \le 1\}.$$


11.4 Negligible Sets 


Definition 11.13 We say that a bounded set is “negligible” if it is measurable and 
its measure is equal to zero. 


Every set made of a single point is negligible. Consequently, all finite or
countable bounded sets are negligible. The edge of a rectangle in $\mathbb{R}^2$ is a negligible
set, as shown by Lemma 11.6.

By the complete additivity of the measure, the union of any sequence of
negligible sets, if it is bounded, is always a negligible set.

Theorem 11.14 If $E$ is a bounded set and $f : E \to \mathbb{R}$ is equal to zero except for a
negligible set, then $f$ is integrable on $E$ and $\int_E f = 0$.


Proof Let $T$ be the negligible set on which $f$ is different from zero. Assume first
that the function $f$ is bounded, i.e., that there is a constant $C > 0$ such that

$$|f(\boldsymbol{x})| \le C$$

for every $\boldsymbol{x} \in E$. We consider a rectangle $I$ containing $E$ and prove that $\int_I f_E = 0$.
Fix $\varepsilon > 0$. Since $T$ has zero measure, there is a gauge $\delta$ such that, for every $\delta$-fine
tagged partition $P = \{(\boldsymbol{x}_j, I_j) : j = 1, \dots, m\}$ of $I$,

$$S(\chi_T, P) = \sum_{\{j \,:\, \boldsymbol{x}_j \in T\}} \mu(I_j) \le \frac{\varepsilon}{C},$$

so that

$$|S(f_E, P)| \le \sum_{\{j \,:\, \boldsymbol{x}_j \in T\}} |f(\boldsymbol{x}_j)|\, \mu(I_j) \le C \sum_{\{j \,:\, \boldsymbol{x}_j \in T\}} \mu(I_j) \le \varepsilon.$$

Hence, if $f$ is bounded, it is integrable on $E$ and $\int_E f = 0$. If $f$ is not bounded,
assume first that it has nonnegative values. Define a sequence of functions
$f_n : E \to \mathbb{R}$ as

$$f_n(\boldsymbol{x}) = \min\{f(\boldsymbol{x}), n\}.$$

Since the functions $f_n$ are bounded and zero except on $T$, for what we just saw they
are integrable on $E$, with $\int_E f_n = 0$, for every $n$. It is easily seen that the defined
sequence satisfies the conditions of the Monotone Convergence Theorem 9.10 and
converges pointwise to $f$. Hence, $f$ is integrable on $E$, and

$$\int_E f = \lim_n \int_E f_n = 0.$$

If $f$ does not have nonnegative values, it is sufficient to consider $f^+$ and $f^-$ and
apply to them what was said earlier. □


Here is a counterpart of the foregoing result.

Theorem 11.15 If $f : E \to \mathbb{R}$ is an integrable function on a bounded set $E$, having
nonnegative values, with $\int_E f = 0$, then $f$ is equal to zero except on a negligible
set.

To prove this, we need the following Chebyshev inequality.

Lemma 11.16 Let $E$ be a bounded set and $f : E \to \mathbb{R}$ an integrable function with
nonnegative values. Then, for every $r > 0$, the set

$$E_r = \{\boldsymbol{x} \in E : f(\boldsymbol{x}) > r\}$$

is measurable, and

$$\mu(E_r) \le \frac{1}{r} \int_E f.$$

Proof Let $I$ be a rectangle containing $E$. Once we have fixed $r > 0$, we define the
functions $f_n : I \to \mathbb{R}$ as

$$f_n(\boldsymbol{x}) = \min\{1,\ n \max\{f_E(\boldsymbol{x}) - r,\ 0\}\}.$$

These make up an increasing sequence of L-integrable functions that pointwise
converges to $\chi_{E_r}$. Clearly,

$$0 \le f_n(\boldsymbol{x}) \le 1 \quad \text{for every } n \text{ and every } \boldsymbol{x} \in I.$$

The Monotone Convergence Theorem 9.10 guarantees that $\chi_{E_r}$ is integrable on $I$,
i.e., that $E_r$ is measurable. Since, for every $\boldsymbol{x} \in E$, we have $r \chi_{E_r}(\boldsymbol{x}) \le f(\boldsymbol{x})$,
integrating both sides of this inequality we obtain the inequality we are looking
for. □
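
As a sanity check of the Chebyshev inequality, consider a one-dimensional instance, chosen by us for simplicity: f(x) = x² on E = [0, 1]. Here E_r = ]√r, 1] for 0 < r < 1, so that μ(E_r) = 1 − √r, while the integral of f over E is 1/3. The sketch below compares the two sides for a few hypothetical values of r.

```python
import math

def measure_E_r(r):
    """Measure of E_r = {x in [0, 1] : x^2 > r}; the set is empty when r >= 1."""
    return max(0.0, 1.0 - math.sqrt(r))

integral_f = 1.0 / 3.0   # integral of x^2 over [0, 1]

# Each entry: (r, measure of E_r, Chebyshev bound (1/r) * integral of f).
checks = [(r, measure_E_r(r), integral_f / r)
          for r in (0.01, 0.1, 0.25, 0.5, 0.9, 2.0)]
for r, m, bound in checks:
    print(r, m, bound)
```

For small r the bound is far from sharp, since it exceeds the measure of the whole interval; the inequality is informative only for larger thresholds.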


Proof of Theorem 11.15 Using the Chebyshev inequality, we have that, for every
positive integer $k$,

$$\mu\left(E_{\frac{1}{k}}\right) \le k \int_E f = 0.$$

Hence, every $E_{\frac{1}{k}}$ is negligible, and since their union is just the set where $f$ is
different from zero, we have the conclusion. □


Definition 11.17 Let $E$ be a bounded set. We say that a proposition is true “almost everywhere” on $E$ (or for almost every point of $E$) if the set of points for which it is false is negligible.


The results just proved have the following simple consequence. 


Corollary 11.18 If two functions $f$ and $g$, defined on the bounded set $E$, are equal almost everywhere, then $f$ is integrable on $E$ if and only if $g$ is. In that case, $\int_E f = \int_E g$.


Proof In such a case the function $g - f$ is equal to zero almost everywhere, hence $\int_E (g - f) = 0$. Then

$$\int_E f = \int_E f + \int_E (g - f) = \int_E \big(f + (g - f)\big) = \int_E g,$$

thereby completing the proof. □




This last corollary permits us to consider some functions that are defined almost 
everywhere and to define their integral. 


Definition 11.19 A function $f$, defined almost everywhere on $E$, with real values, is said to be “integrable” on $E$ if it can be extended to an integrable function $g : E \to \mathbb{R}$. In this case, we set $\int_E f = \int_E g$.


The preceding definition is consistent since the integral will not depend on the 
particular extension. 

It can be seen that all the properties and theorems seen till now remain true for 
such functions. The reader is invited to verify this. 


11.5 A Characterization of Measurable Bounded Sets 
The following covering lemma will be useful in what follows. 


Lemma 11.20 Let $E$ be a set contained in a rectangle $I$, and let $\delta$ be a gauge on $E$. Then there is a finite or countable family of nonoverlapping rectangles $J_k$, contained in $I$, whose union covers the set $E$, with the following property: In each of the sets $J_k$ there is a point $x_k$ belonging to $E$ such that $J_k \subseteq B[x_k, \delta(x_k)]$.


Proof We consider for simplicity the case $N = 2$. Let us divide the rectangle $I$ into four rectangles, having the same areas, by the axes of its edges. We proceed analogously with each of these four rectangles, obtaining 16 smaller rectangles, and so on. We thus obtain a countable family of smaller and smaller rectangles. For every point $x$ of $E$ we can choose one of these rectangles that contains $x$ and is itself contained in $B[x, \delta(x)]$. These rectangles would satisfy the properties of the statement if they were nonoverlapping.

In order that the sets $J_k$ be nonoverlapping, it is necessary to choose them carefully, and here is how to do it. We first choose those from the first four-rectangle partition, if there are any, that contain a point $x_k$ belonging to $E$ such that $J_k \subseteq B[x_k, \delta(x_k)]$; once this choice has been made, we eliminate all the smaller rectangles contained in them. We consider then the 16-rectangle partition and, among those that remained after the first elimination procedure, we choose those, if there are any, that contain a point $x_k$ belonging to $E$ such that $J_k \subseteq B[x_k, \delta(x_k)]$; once this choice has been made, we eliminate all the smaller rectangles contained in them; and so on. □


Remark 11.21 Note that if, in the assumptions of the covering lemma, it happens that $E$ is contained in an open set that itself is contained in $I$, then all the rectangles $J_k$ can be chosen so that they are all contained in that open set.


We can now prove a characterization of measurable bounded sets. In the 
following statement, the words in square brackets may be omitted. 




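Proposition 11.22 Let $E$ be a bounded set, contained in a rectangle $I$. The following three propositions are equivalent:

(i) The set $E$ is measurable.

(ii) For every $\varepsilon > 0$ there are two finite or countable families $(J_k)$ and $(J'_k)$, each made of [nonoverlapping] rectangles contained in $I$, such that

$$E \subseteq \bigcup_k J_k, \qquad I \setminus E \subseteq \bigcup_k J'_k, \qquad \text{and} \qquad \mu\Big(\Big(\bigcup_k J_k\Big) \cap \Big(\bigcup_k J'_k\Big)\Big) \le \varepsilon.$$

(iii) There are two sequences $(E_n)_{n \ge 1}$ and $(E'_n)_{n \ge 1}$ of measurable bounded subsets such that

$$E'_n \subseteq E \subseteq E_n, \qquad \lim_n \big(\mu(E_n) - \mu(E'_n)\big) = 0.$$

In that case, we have

$$\mu(E) = \lim_n \mu(E_n) = \lim_n \mu(E'_n).$$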

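Proof Let us first prove that (i) implies (ii). Assume that $E$ is measurable, and fix $\varepsilon > 0$. By the Saks-Henstock Theorem 9.3, there is a gauge $\delta$ on $I$ such that, for every $\delta$-fine tagged subpartition $P = \{(x_j, K_j) : j = 1,\dots,m\}$ of $I$,

$$\Big| \sum_{j=1}^m \Big( \chi_E(x_j)\,\mu(K_j) - \int_{K_j} \chi_E \Big) \Big| \le \frac{\varepsilon}{2}.$$

By Lemma 11.20, there is a family of nonoverlapping rectangles $J_k$, contained in $I$, whose union covers $E$, and in each $J_k$ there is a point $x_k$ belonging to $E$ such that $J_k \subseteq B[x_k, \delta(x_k)]$. Let us fix a positive integer $N$ and consider only $(x_1, J_1), \dots, (x_N, J_N)$. They make up a $\delta$-fine tagged subpartition of $I$. Since $\chi_E(x_k) = 1$ for every $k$, from the preceding inequality we then deduce that

$$\Big| \sum_{k=1}^N \Big( \mu(J_k) - \int_{J_k} \chi_E \Big) \Big| \le \frac{\varepsilon}{2},$$

whence

$$\sum_{k=1}^N \mu(J_k) \le \sum_{k=1}^N \int_{J_k} \chi_E + \frac{\varepsilon}{2} \le \int_I \chi_E + \frac{\varepsilon}{2} = \mu(E) + \frac{\varepsilon}{2}.$$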




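Since this holds for every positive integer $N$, we have thus constructed a family $(J_k)$ of nonoverlapping rectangles such that

$$E \subseteq \bigcup_k J_k, \qquad \sum_k \mu(J_k) \le \mu(E) + \frac{\varepsilon}{2}.$$

Consider now $I \setminus E$, which is also measurable. We can repeat the same procedure that we just followed replacing $E$ with $I \setminus E$, thereby finding a family $(J'_k)$ of nonoverlapping rectangles, contained in $I$, such that

$$I \setminus E \subseteq \bigcup_k J'_k, \qquad \sum_k \mu(J'_k) \le \mu(I \setminus E) + \frac{\varepsilon}{2}.$$

Consequently,

$$I \setminus \Big(\bigcup_k J'_k\Big) \subseteq E \subseteq \bigcup_k J_k,$$

and hence

$$\begin{aligned}
\mu\Big(\Big(\bigcup_k J_k\Big) \cap \Big(\bigcup_k J'_k\Big)\Big)
&= \mu\Big(\Big(\bigcup_k J_k\Big) \setminus \Big(I \setminus \bigcup_k J'_k\Big)\Big) \\
&= \mu\Big(\bigcup_k J_k\Big) - \mu\Big(I \setminus \bigcup_k J'_k\Big) \\
&= \mu\Big(\bigcup_k J_k\Big) - \mu(I) + \mu\Big(\bigcup_k J'_k\Big) \\
&\le \Big(\mu(E) + \frac{\varepsilon}{2}\Big) - \mu(I) + \Big(\mu(I \setminus E) + \frac{\varepsilon}{2}\Big) \\
&= \varepsilon,
\end{aligned}$$

and the implication is thus proved.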
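Taking $\varepsilon = 1/n$ it is easy to see that (ii) implies (iii). Let us prove now that (iii) implies (i). Consider the measurable sets

$$\tilde{E} = \bigcap_{n \ge 1} E_n, \qquad E' = \bigcup_{n \ge 1} E'_n,$$

for which it must be that

$$E' \subseteq E \subseteq \tilde{E}, \qquad \mu(E') = \mu(\tilde{E}).$$

Equivalently, we have

$$\chi_{E'} \le \chi_E \le \chi_{\tilde{E}}, \qquad \int_I (\chi_{\tilde{E}} - \chi_{E'}) = 0,$$

so that $\chi_{E'} = \chi_E = \chi_{\tilde{E}}$ almost everywhere. Then $E$ is measurable and $\mu(E) = \mu(E') = \mu(\tilde{E})$. Moreover,

$$0 \le \lim_n \big[\mu(E) - \mu(E'_n)\big] \le \lim_n \big[\mu(E_n) - \mu(E'_n)\big] = 0,$$

hence $\mu(E) = \lim_n \mu(E'_n)$. Analogously, we see that $\mu(E) = \lim_n \mu(E_n)$, and the proof is thus completed. □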


Proposition 11.23 Let $E$ be a bounded set. Then $E$ is negligible if and only if, for every $\varepsilon > 0$, there is a finite or countable family $(J_k)$ of [nonoverlapping] rectangles such that

$$E \subseteq \bigcup_k J_k, \qquad \sum_k \mu(J_k) \le \varepsilon.$$


Proof The necessary condition is proved in the first part of the previous proposition. Let us prove the sufficiency. Once fixed $\varepsilon > 0$, assume there exists a family $(J_k)$ with the given properties. Let $I$ be a rectangle containing the set $E$. On the other hand, consider a family $(J'_k)$ whose elements all coincide with $I$. The conditions of the previous proposition are then satisfied, so that $E$ is indeed measurable. Then

$$\mu(E) \le \mu\Big(\bigcup_k J_k\Big) \le \sum_k \mu(J_k) \le \varepsilon;$$

since $\varepsilon$ is arbitrary, it must be that $\mu(E) = 0$. □


Remark 11.24 Observe that if $E$ is contained in an open set that is itself contained in a rectangle $I$, then all the rectangles $J_k$ can be chosen in such a way that they are all contained in that open set.


As a consequence of the preceding proposition, it is not difficult to prove the 
following corollary. 


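Corollary 11.25 If $I_{N-1}$ is a rectangle in $\mathbb{R}^{N-1}$ and $T$ is a negligible subset of $\mathbb{R}$, then $I_{N-1} \times T$ is negligible in $\mathbb{R}^N$.


Proof Fix $\varepsilon > 0$, and, according to Proposition 11.23, let $(J_k)$ be a finite or countable family of intervals in $\mathbb{R}$ such that

$$T \subseteq \bigcup_k J_k, \qquad \sum_k \mu(J_k) \le \frac{\varepsilon}{\mu(I_{N-1})}.$$

Defining the rectangles $\tilde{J}_k = I_{N-1} \times J_k$, we have that

$$I_{N-1} \times T \subseteq \bigcup_k \tilde{J}_k, \qquad \sum_k \mu(\tilde{J}_k) = \mu(I_{N-1}) \sum_k \mu(J_k) \le \mu(I_{N-1})\,\frac{\varepsilon}{\mu(I_{N-1})} = \varepsilon,$$

and Proposition 11.23 applies. □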


11.6 Continuous Functions and L-Integrable Functions 


We begin this section by showing that the continuous functions are L-integrable on 
compact sets. 


Theorem 11.26 Let $E \subseteq \mathbb{R}^N$ be a compact set and $f : E \to \mathbb{R}$ be a continuous function. Then $f$ is L-integrable on $E$.


Proof We consider for simplicity the case $N = 2$. Since $f$ is continuous on a compact set, there is a constant $C > 0$ such that

$$|f(x)| \le C, \quad \text{for every } x \in E.$$

Let $I$ be a rectangle containing $E$. First we divide $I$ into four rectangles by tracing the segments joining the midpoints of its edges; we denote by $U_{1,1}, U_{1,2}, U_{1,3}, U_{1,4}$ these subrectangles. We now divide again each of these rectangles in the same way, thereby obtaining 16 smaller subrectangles, which we denote by $U_{2,1}, U_{2,2}, \dots, U_{2,16}$. Proceeding in this way, for every $n$ we will have a subdivision of the rectangle $I$ into $2^{2n}$ small rectangles $U_{n,j}$, with $j = 1, \dots, 2^{2n}$. Whenever $E$ has a nonempty intersection with $U_{n,j}$, we choose and fix a point $x_{n,j} \in E \cap U_{n,j}$. Define now the function $f_n$ in the following way:

• If $E \cap U_{n,j}$ is nonempty, then $f_n$ is constant on the interior of $U_{n,j}$ with value $f(x_{n,j})$.
• If $E \cap U_{n,j}$ is empty, then $f_n$ is constant on the interior of $U_{n,j}$ with value 0.

The functions $f_n$ are thus defined almost everywhere on $I$, not being defined only on the points of the grid made up by the previously constructed segments, which form a countable family of negligible sets. The functions $f_n$ are integrable on each subrectangle $U_{n,j}$, since they are constant in its interior. By the property of additivity on subrectangles, these functions are therefore integrable on $I$. Moreover,

$$|f_n(x)| \le C, \quad \text{for almost every } x \in I \text{ and every } n \ge 1.$$

Let us see now that $(f_n)_n$ converges pointwise almost everywhere to $f_E$. Indeed, taking a point $x \in I$ not belonging to the grid, for every $n$ there is a $j = j(n)$ for which $x \in U_{n,j(n)}$. We have two possibilities:

(a) $x \notin E$; in this case, since $E$ is closed, we have that, for $n$ sufficiently large, $U_{n,j(n)}$ (whose dimensions tend to zero as $n \to \infty$) will have an empty intersection with $E$, and then $f_n(x) = 0 = f_E(x)$.

(b) $x \in E$; in this case, if $n \to +\infty$, we have that $x_{n,j(n)} \to x$ (again using the fact that $U_{n,j(n)}$ has dimensions tending to zero). By the continuity of $f$, we have that

$$f_n(x) = f(x_{n,j(n)}) \to f(x) = f_E(x).$$

The Dominated Convergence Theorem 9.13 then yields the conclusion. □

We now see that L-integrability is conserved on measurable subsets. 


Theorem 11.27 Let $f : E \to \mathbb{R}$ be an L-integrable function on a bounded set $E$. Then $f$ is L-integrable on every measurable subset of $E$.


Proof Assume first that $f$ has nonnegative values. Let $S$ be a measurable subset of $E$, and define on $E$ the functions $f_n = \min\{f, n\chi_S\}$. They form an increasing sequence of L-integrable functions since both $f$ and $n\chi_S$ are L-integrable, and the sequence converges pointwise to $f_S$. Moreover, we have

$$\int_E f_n \le \int_E f$$

for every $n$. The Monotone Convergence Theorem 9.10 then guarantees that $f$ is integrable on $S$ in this case. In the general case, since $f$ is L-integrable, both $f^+$ and $f^-$ are L-integrable on $E$. Hence, based on the preceding discussion, they are both L-integrable on $S$, and then $f$ is, too. □


Let us now prove the complete additivity property of the integral for L- 
integrable functions. We will say that two measurable bounded subsets are 
“nonoverlapping” if their intersection is a negligible set. 




Theorem 11.28 Let $(E_k)$ be a finite or countable family of measurable nonoverlapping sets whose union is a bounded set $E$. Then $f$ is L-integrable on $E$ if and only if the following two conditions hold:

(a) $f$ is L-integrable on each $E_k$.
(b) $\sum_k \int_{E_k} |f(x)|\,dx < +\infty$.

In that case, we have

$$\int_E f = \sum_k \int_{E_k} f.$$


Proof Observe that

$$f(x) = \sum_k f_{E_k}(x), \qquad |f(x)| = \sum_k |f_{E_k}(x)|$$

for almost every $x \in E$. If $f$ is L-integrable on $E$, then from the preceding theorem (a) follows. Moreover, it is obvious that (b) holds whenever the sets $E_k$ are finite in number. If instead they are infinite, then for any fixed $n$ we have

$$\sum_{k=1}^n \int_{E_k} |f(x)|\,dx = \sum_{k=1}^n \int_E |f_{E_k}(x)|\,dx = \int_E \sum_{k=1}^n |f_{E_k}(x)|\,dx \le \int_E |f(x)|\,dx,$$

and (b) follows.

Assume now that (a) and (b) hold. If the sets $E_k$ are finite in number, it is sufficient to integrate on $E$ both terms in the equation $f = \sum_k f_{E_k}$. If instead they are infinite, assume first that $f$ has nonnegative values. In this case, Corollary 9.11, when applied to the series $\sum_k f_{E_k}$, yields the conclusion. In the general case, it is sufficient to consider, as usual, the positive and negative parts of $f$. □


11.7 Limits and Derivatives under the Integration Sign 
Let $X$ be a metric space, $Y$ be a bounded subset of $\mathbb{R}^N$, and consider a function $f : X \times Y \to \mathbb{R}$. (For simplicity, we may think of $X$ and $Y$ as subsets of $\mathbb{R}$.) The first question we want to address is when the formula

$$\lim_{x \to x_0} \Big( \int_Y f(x,y)\,dy \Big) = \int_Y \Big( \lim_{x \to x_0} f(x,y) \Big)\,dy$$

holds. What follows is a generalization of the Dominated Convergence Theorem 9.13.




Theorem 11.29 Let $x_0$ be an accumulation point of $X$, and let the following assumptions hold true:

(a) For every $x \in X \setminus \{x_0\}$, the function $f(x, \cdot)$ is integrable on $Y$, so that we can define the function

$$F(x) = \int_Y f(x,y)\,dy.$$

(b) For almost every $y \in Y$ the limit $\lim_{x \to x_0} f(x,y)$ exists and is finite, so that we can define almost everywhere the function

$$\eta(y) = \lim_{x \to x_0} f(x,y).$$

(c) There are two integrable functions $g, h : Y \to \mathbb{R}$ such that

$$g(y) \le f(x,y) \le h(y)$$

for every $x \in X \setminus \{x_0\}$ and almost every $y \in Y$.

Then $\eta$ is integrable on $Y$, and we have

$$\lim_{x \to x_0} F(x) = \int_Y \eta(y)\,dy.$$


Proof Let us take a sequence $(x_n)_n$ in $X \setminus \{x_0\}$ that tends to $x_0$. Define, for every $n$, the functions $f_n : Y \to \mathbb{R}$ such that $f_n(y) = f(x_n, y)$. The assumptions allow us to apply the Dominated Convergence Theorem 9.13, so that

$$\lim_n F(x_n) = \lim_n \Big( \int_Y f_n(y)\,dy \Big) = \int_Y \Big( \lim_n f_n(y) \Big)\,dy = \int_Y \eta(y)\,dy.$$

The conclusion then follows from the characterization of the limit by the use of sequences (Proposition 4.3). □


We have the following consequence of the above theorem. 


Corollary 11.30 If $X$ is a subset of $\mathbb{R}^M$, $Y \subseteq \mathbb{R}^N$ is compact, and $f : X \times Y \to \mathbb{R}$ is continuous, then the function $F : X \to \mathbb{R}$, defined by

$$F(x) = \int_Y f(x,y)\,dy,$$

is continuous.


Proof The function $F(x)$ is well defined, since $f(x, \cdot)$ is continuous on the compact set $Y$. Let us fix $x_0 \in X$ and prove that $F$ is continuous at $x_0$. By the continuity of $f$,

$$\eta(y) = \lim_{x \to x_0} f(x,y) = f(x_0, y).$$

Moreover, given a compact neighborhood $U$ of $x_0$, there is a constant $C > 0$ such that $|f(x,y)| \le C$ for every $(x,y) \in U \times Y$. The previous theorem can then be applied, and we have

$$\lim_{x \to x_0} F(x) = \int_Y f(x_0, y)\,dy = F(x_0),$$

thereby proving that $F$ is continuous at $x_0$. □


Now let $X$ be a subset of $\mathbb{R}$. The second question we want to address is when the formula

$$\frac{d}{dx} \Big( \int_Y f(x,y)\,dy \Big) = \int_Y \Big( \frac{\partial f}{\partial x}(x,y) \Big)\,dy$$

holds. Here is an answer.


Theorem 11.31 (Leibniz Rule) Let $X$ be an interval in $\mathbb{R}$ containing $x_0$, and let the following assumptions hold true:

(a) For every $x \in X$ the function $f(x, \cdot)$ is integrable on $Y$, so that we can define the function

$$F(x) = \int_Y f(x,y)\,dy.$$

(b) For every $x \in X$ and almost every $y \in Y$ the partial derivative $\frac{\partial f}{\partial x}(x,y)$ exists.
(c) There are two integrable functions $g, h : Y \to \mathbb{R}$ such that

$$g(y) \le \frac{\partial f}{\partial x}(x,y) \le h(y)$$

for every $x \in X$ and almost every $y \in Y$.

Then the function $\frac{\partial f}{\partial x}(x_0, \cdot)$, defined almost everywhere on $Y$, is integrable there, the derivative of $F$ in $x_0$ exists, and we have

$$F'(x_0) = \int_Y \Big( \frac{\partial f}{\partial x}(x_0, y) \Big)\,dy.$$




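Proof We define, for $x \in X$ different from $x_0$, the function

$$\psi(x,y) = \frac{f(x,y) - f(x_0,y)}{x - x_0}.$$

For every $x \in X \setminus \{x_0\}$ the function $\psi(x, \cdot)$ is integrable on $Y$. Moreover, for almost every $y \in Y$ we have

$$\lim_{x \to x_0} \psi(x,y) = \frac{\partial f}{\partial x}(x_0, y).$$

By the Lagrange Mean Value Theorem 6.11, for $(x,y)$ as previously there is a $\xi \in X$ between $x_0$ and $x$ such that

$$\psi(x,y) = \frac{\partial f}{\partial x}(\xi, y).$$

By assumption (c), we then have

$$g(y) \le \psi(x,y) \le h(y)$$

for every $x \in X \setminus \{x_0\}$ and almost every $y \in Y$. By the previous theorem, we can conclude that the function $\frac{\partial f}{\partial x}(x_0, \cdot)$, defined almost everywhere on $Y$, is integrable there, and

$$\lim_{x \to x_0} \Big( \int_Y \psi(x,y)\,dy \Big) = \int_Y \Big( \frac{\partial f}{\partial x}(x_0, y) \Big)\,dy.$$

On the other hand,

$$\begin{aligned}
\lim_{x \to x_0} \Big( \int_Y \psi(x,y)\,dy \Big)
&= \lim_{x \to x_0} \Big( \int_Y \frac{f(x,y) - f(x_0,y)}{x - x_0}\,dy \Big) \\
&= \lim_{x \to x_0} \frac{1}{x - x_0} \Big( \int_Y f(x,y)\,dy - \int_Y f(x_0,y)\,dy \Big) \\
&= \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0},
\end{aligned}$$

so that $F$ is differentiable at $x_0$, and the conclusion holds. □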


Corollary 11.32 If $X$ is an interval in $\mathbb{R}$, $Y$ is a compact subset of $\mathbb{R}^N$, and the function $f : X \times Y \to \mathbb{R}$ is continuous and has a continuous partial derivative $\frac{\partial f}{\partial x} : X \times Y \to \mathbb{R}$, then the function $F : X \to \mathbb{R}$, defined by

$$F(x) = \int_Y f(x,y)\,dy,$$

is differentiable with a continuous derivative.


Proof The function $F(x)$ is well defined, since $f(x, \cdot)$ is continuous on the compact set $Y$. Taking a point $x_0 \in X$ and a nontrivial compact interval $U \subseteq X$ containing it, there is a constant $C > 0$ such that $\big|\frac{\partial f}{\partial x}(x,y)\big| \le C$ for every $(x,y) \in U \times Y$. By the preceding theorem, $F$ is differentiable at $x_0$. The same argument holds replacing $x_0$ with any $x \in X$, and

$$F'(x) = \int_Y \Big( \frac{\partial f}{\partial x}(x,y) \Big)\,dy.$$

The continuity of $F' : X \to \mathbb{R}$ now follows from Corollary 11.30. □
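
The Leibniz rule lends itself to a quick numerical sanity check. The sketch below assumes the sample integrand $f(x,y) = \sin(xy)$ on $Y = [0,1]$ (not taken from the text) and compares a central difference quotient of $F$ with the integral of $\partial f/\partial x$; the helper names and the step size are illustrative choices.

```python
import math

def midpoint_sum(g, a, b, n=20_000):
    """Midpoint Riemann sum of g on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(x):
    """F(x) = integral over [0, 1] of sin(x*y) dy (assumed sample integrand)."""
    return midpoint_sum(lambda y: math.sin(x * y), 0.0, 1.0)

def integral_of_partial(x):
    """Integral over [0, 1] of (d/dx) sin(x*y) = y*cos(x*y)."""
    return midpoint_sum(lambda y: y * math.cos(x * y), 0.0, 1.0)

def central_difference(x, h=1e-4):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

x0 = 1.3
print(central_difference(x0), integral_of_partial(x0))
```

The two printed numbers should agree to several decimal places, in accordance with $F'(x_0) = \int_Y \frac{\partial f}{\partial x}(x_0,y)\,dy$.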


Example Consider, for $x \ge 0$, the function

$$f(x,y) = \frac{e^{-x^2(y^2+1)}}{y^2+1}.$$

We want to determine whether the corresponding function $F(x) = \int_0^1 f(x,y)\,dy$ is differentiable and, in this case, to find its derivative. We have that

$$\frac{\partial f}{\partial x}(x,y) = -2x\,e^{-x^2(y^2+1)},$$

which, for $y \in [0,1]$ and $x \ge 0$, is such that

$$-\sqrt{\frac{2}{e}} \le -2x\,e^{-x^2(y^2+1)} \le 0.$$

We can then apply the Leibniz rule, so that

$$F'(x) = -2x \int_0^1 e^{-x^2(y^2+1)}\,dy.$$

Let us make a digression, so as to present an elegant formula. By the change of variable $t = xy$, we have

$$-2x \int_0^1 e^{-x^2(y^2+1)}\,dy = -2e^{-x^2} \int_0^x e^{-t^2}\,dt = -\frac{d}{dx} \Big( \int_0^x e^{-t^2}\,dt \Big)^2.$$

Taking into account that $F(0) = \pi/4$, we have

$$F(x) = \frac{\pi}{4} - \Big( \int_0^x e^{-t^2}\,dt \Big)^2.$$

We would like now to pass to the limit for $x \to +\infty$. Since, for $x \ge 0$, we have

$$0 \le \frac{e^{-x^2(y^2+1)}}{y^2+1} \le 1,$$

we can pass to the limit under the sign of integration, thereby obtaining

$$\lim_{x \to +\infty} \int_0^1 \frac{e^{-x^2(y^2+1)}}{y^2+1}\,dy = \int_0^1 \Big( \lim_{x \to +\infty} \frac{e^{-x^2(y^2+1)}}{y^2+1} \Big)\,dy = 0.$$

Hence,

$$\Big( \int_0^{+\infty} e^{-t^2}\,dt \Big)^2 = \frac{\pi}{4},$$

and, by symmetry,

$$\int_{-\infty}^{+\infty} e^{-t^2}\,dt = \sqrt{\pi},$$

which is a very useful formula in various applications.
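
The identity $F(x) + \big(\int_0^x e^{-t^2}dt\big)^2 = \pi/4$ obtained in this example can be tested numerically. The following sketch uses a plain midpoint rule; the helper names `midpoint_sum`, `F`, and `G` are introduced here only for illustration, and the truncation of the infinite interval at $t = 6$ is an assumption justified by the rapid decay of $e^{-t^2}$.

```python
import math

def midpoint_sum(g, a, b, n=20_000):
    """Midpoint Riemann sum of g on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(x):
    """F(x) = integral over [0, 1] of exp(-x^2 (y^2 + 1)) / (y^2 + 1) dy."""
    return midpoint_sum(
        lambda y: math.exp(-x * x * (y * y + 1.0)) / (y * y + 1.0), 0.0, 1.0
    )

def G(x):
    """G(x) = integral over [0, x] of exp(-t^2) dt."""
    return midpoint_sum(lambda t: math.exp(-t * t), 0.0, x)

for x in (0.0, 0.5, 1.0, 3.0):
    print(x, F(x) + G(x) ** 2, math.pi / 4)

# Truncated version of the Gaussian integral over the whole real line.
print(2.0 * G(6.0), math.sqrt(math.pi))
```

In each printed row the second and third entries should be close to each other, and the last line compares the truncated integral with $\sqrt{\pi}$.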


11.8 Reduction Formula 


In this section we will prove a fundamental result that permits us to compute the 
integral of a function of several variables by an iterative process of integration of 
functions of a single variable. It will be useful to recall some notation. For any fixed 
$x$ we will denote by $f(x, \cdot)$ the function $y \mapsto f(x,y)$. Similarly, for any fixed $y$ we
will denote by $f(\cdot, y)$ the function $x \mapsto f(x,y)$.

Before stating the main theorem, it will be useful to first prove a preliminary 
result. 


Proposition 11.33 Let $f : I \to \mathbb{R}$ be an integrable function on the rectangle $I = [a_1, b_1] \times [a_2, b_2]$. Then, for almost every $x \in [a_1, b_1]$, the function $f(x, \cdot)$ is integrable on $[a_2, b_2]$.


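Proof Let $T \subseteq [a_1, b_1]$ be the set of those $x \in [a_1, b_1]$ for which $f(x, \cdot)$ is not integrable on $[a_2, b_2]$. Let us prove that $T$ is a negligible set. For each $x \in T$, the Cauchy condition does not hold. Hence, if we define the sets

$$T_n = \Big\{ x \in [a_1,b_1] : \begin{array}{l} \text{for every gauge } \delta_2 \text{ on } [a_2, b_2] \text{ there are two} \\ \delta_2\text{-fine tagged partitions } P_2 \text{ and } Q_2 \text{ of } [a_2, b_2] \text{ such that} \\ S(f(x,\cdot), P_2) - S(f(x,\cdot), Q_2) > \frac{1}{n} \end{array} \Big\},$$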




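we have that each $x \in T$ belongs to $T_n$ if $n$ is sufficiently large. Thus, $T$ is the union of all $T_n$, and if we prove that any $T_n$ is negligible, then by the properties of the measure we will have that $T$ is also negligible. To do so, let us consider a certain $T_n$ and fix $\varepsilon > 0$. Since $f$ is integrable on $I$, there is a gauge $\delta$ on $I$ such that, given two $\delta$-fine tagged partitions $P$ and $Q$ of $I$, we have

$$|S(f,P) - S(f,Q)| \le \frac{\varepsilon}{n}.$$

The gauge $\delta$ on $I$ determines, for every $x \in [a_1, b_1]$, a gauge $\delta(x, \cdot)$ on $[a_2, b_2]$. We now associate to each $x \in [a_1, b_1]$ two $\delta(x, \cdot)$-fine tagged partitions $P_2^x$ and $Q_2^x$ of $[a_2, b_2]$ in the following way:

- If $x \in T_n$, we can choose $P_2^x$ and $Q_2^x$ such that

$$S(f(x,\cdot), P_2^x) - S(f(x,\cdot), Q_2^x) > \frac{1}{n}.$$

- If instead $x \notin T_n$, we take $P_2^x$ and $Q_2^x$ equal to each other.

Let us write the two tagged partitions $P_2^x$ and $Q_2^x$ thus determined:

$$P_2^x = \{(y_j^x, K_j^x) : j = 1, \dots, m^x\}, \qquad Q_2^x = \{(\tilde{y}_j^x, \tilde{K}_j^x) : j = 1, \dots, \tilde{m}^x\}.$$

We define a gauge $\delta_1$ on $[a_1, b_1]$, setting

$$\delta_1(x) = \min\{\delta(x, y_1^x), \dots, \delta(x, y_{m^x}^x), \delta(x, \tilde{y}_1^x), \dots, \delta(x, \tilde{y}_{\tilde{m}^x}^x)\}.$$

Now let $P_1 = \{(x_i, J_i) : i = 1, \dots, k\}$ be a $\delta_1$-fine tagged partition of $[a_1, b_1]$. We want to prove that $S(\chi_{T_n}, P_1) \le \varepsilon$, i.e.,

$$\sum_{\{i \,:\, x_i \in T_n\}} \mu(J_i) \le \varepsilon.$$

To this end, define the following two tagged partitions of $I$, which make use of the elements of $P_1$:

$$P = \{((x_i, y_j^{x_i}), J_i \times K_j^{x_i}) : i = 1, \dots, k,\ j = 1, \dots, m^{x_i}\},$$
$$Q = \{((x_i, \tilde{y}_j^{x_i}), J_i \times \tilde{K}_j^{x_i}) : i = 1, \dots, k,\ j = 1, \dots, \tilde{m}^{x_i}\}.$$

They are $\delta$-fine, and hence

$$|S(f,P) - S(f,Q)| \le \frac{\varepsilon}{n}.$$

On the other hand, we have

$$\begin{aligned}
|S(f,P) - S(f,Q)|
&= \Big| \sum_{i=1}^k \sum_{j=1}^{m^{x_i}} f(x_i, y_j^{x_i})\,\mu(J_i \times K_j^{x_i}) - \sum_{i=1}^k \sum_{j=1}^{\tilde{m}^{x_i}} f(x_i, \tilde{y}_j^{x_i})\,\mu(J_i \times \tilde{K}_j^{x_i}) \Big| \\
&= \Big| \sum_{i=1}^k \mu(J_i) \Big[ \sum_{j=1}^{m^{x_i}} f(x_i, y_j^{x_i})\,\mu(K_j^{x_i}) - \sum_{j=1}^{\tilde{m}^{x_i}} f(x_i, \tilde{y}_j^{x_i})\,\mu(\tilde{K}_j^{x_i}) \Big] \Big| \\
&= \Big| \sum_{i=1}^k \mu(J_i) \big[ S(f(x_i,\cdot), P_2^{x_i}) - S(f(x_i,\cdot), Q_2^{x_i}) \big] \Big| \\
&= \sum_{\{i \,:\, x_i \in T_n\}} \mu(J_i) \big[ S(f(x_i,\cdot), P_2^{x_i}) - S(f(x_i,\cdot), Q_2^{x_i}) \big].
\end{aligned}$$

Recalling that, for $x_i \in T_n$,

$$S(f(x_i,\cdot), P_2^{x_i}) - S(f(x_i,\cdot), Q_2^{x_i}) > \frac{1}{n},$$

we conclude that

$$\frac{\varepsilon}{n} \ge |S(f,P) - S(f,Q)| \ge \frac{1}{n} \sum_{\{i \,:\, x_i \in T_n\}} \mu(J_i),$$

and hence $S(\chi_{T_n}, P_1) \le \varepsilon$, which is what we wanted to prove. All this shows that the sets $T_n$ are negligible, and therefore $T$ is negligible, too. □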


The following theorem, due to Guido Fubini, permits us to compute the integral 
of an integrable function of two variables by performing two integrations of 
functions of one variable. 


Theorem 11.34 (Reduction Theorem I) Let $f : I \to \mathbb{R}$ be an integrable function on the rectangle $I = [a_1, b_1] \times [a_2, b_2]$. Then:

(a) For almost every $x \in [a_1, b_1]$ the function $f(x, \cdot)$ is integrable on $[a_2, b_2]$.
(b) The function $\int_{a_2}^{b_2} f(\cdot, y)\,dy$, defined almost everywhere on $[a_1, b_1]$, is integrable there.
(c) We have

$$\int_I f = \int_{a_1}^{b_1} \Big( \int_{a_2}^{b_2} f(x,y)\,dy \Big)\,dx.$$


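Proof We already proved (a) in Proposition 11.33. Let us now prove (b) and (c). Let $T$ be the negligible subset of $[a_1, b_1]$ such that, for $x \in T$, the function $f(x, \cdot)$ is not integrable on $[a_2, b_2]$. Since $T \times [a_2, b_2]$ is negligible in $I$, we can modify on that set the function $f$ without changing the integrability properties. We can set, for example, $f = 0$ on that set. In this way, we can assume without loss of generality that $T$ is empty. Let us define

$$F(x) = \int_{a_2}^{b_2} f(x,y)\,dy.$$

We want to prove that $F$ is integrable on $[a_1, b_1]$ and that

$$\int_{a_1}^{b_1} F = \int_I f.$$

Let $\varepsilon > 0$ be fixed. Because of the integrability of $f$ on $I$, there is a gauge $\delta$ on $I$ such that, for every $\delta$-fine tagged partition $P$ of $I$,

$$\Big| S(f,P) - \int_I f \Big| \le \frac{\varepsilon}{2}.$$

For every $x \in [a_1, b_1]$, since $f(x, \cdot)$ is integrable on $[a_2, b_2]$ with integral $F(x)$, there exists a gauge $\bar{\delta}^x : [a_2, b_2] \to \mathbb{R}$ such that, taking any $\bar{\delta}^x$-fine tagged partition $P_2$ of $[a_2, b_2]$, we have that

$$|S(f(x,\cdot), P_2) - F(x)| \le \frac{\varepsilon}{2(b_1 - a_1)}.$$

We can assume that $\bar{\delta}^x(y) \le \delta(x,y)$ for every $(x,y) \in I$. Then let us choose for every $x \in [a_1, b_1]$ a $\bar{\delta}^x$-fine tagged partition $P_2^x$ of $[a_2, b_2]$ and write it explicitly as

$$P_2^x = \{(y_j^x, K_j^x) : j = 1, \dots, m^x\}.$$

We will thus have that, for every $x \in [a_1, b_1]$,

$$|F(x) - S(f(x,\cdot), P_2^x)| \le \frac{\varepsilon}{2(b_1 - a_1)}.$$

We define a gauge $\delta_1$ on $[a_1, b_1]$ by setting

$$\delta_1(x) = \min\{\delta(x, y_1^x), \dots, \delta(x, y_{m^x}^x)\}.$$

We will prove that, for every $\delta_1$-fine tagged partition $P_1$ of $[a_1, b_1]$,

$$\Big| S(F, P_1) - \int_I f \Big| \le \varepsilon.$$

Let us take a $\delta_1$-fine tagged partition of $[a_1, b_1]$,

$$P_1 = \{(x_i, J_i) : i = 1, \dots, n\},$$

and construct, starting from it, a $\delta$-fine tagged partition of $I$,

$$P = \{((x_i, y_j^{x_i}), J_i \times K_j^{x_i}) : i = 1, \dots, n,\ j = 1, \dots, m^{x_i}\}.$$

We have the following inequalities:

$$\begin{aligned}
\Big| S(F, P_1) - \int_I f \Big|
&\le |S(F, P_1) - S(f, P)| + \Big| S(f, P) - \int_I f \Big| \\
&\le \Big| \sum_{i=1}^n F(x_i)\,\mu(J_i) - \sum_{i=1}^n \sum_{j=1}^{m^{x_i}} f(x_i, y_j^{x_i})\,\mu(J_i \times K_j^{x_i}) \Big| + \frac{\varepsilon}{2} \\
&\le \sum_{i=1}^n \Big| F(x_i) - \sum_{j=1}^{m^{x_i}} f(x_i, y_j^{x_i})\,\mu(K_j^{x_i}) \Big|\,\mu(J_i) + \frac{\varepsilon}{2} \\
&\le \frac{\varepsilon}{2(b_1 - a_1)} \sum_{i=1}^n \mu(J_i) + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}$$

This proves that $F$ is integrable on $[a_1, b_1]$ and

$$\int_{a_1}^{b_1} F = \int_I f.$$

The proof is thus completed. □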




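Example Consider the function $f(x,y) = x^2 \sin y$ on the rectangle $I = [-1, 1] \times [0, \pi]$. Since $f$ is continuous on a compact set, it is integrable there, so that

$$\int_I f = \int_{-1}^{1} \Big( \int_0^{\pi} x^2 \sin y\,dy \Big)\,dx = \int_{-1}^{1} x^2 [-\cos y]_0^{\pi}\,dx = 2 \int_{-1}^{1} x^2\,dx = 2 \Big[ \frac{x^3}{3} \Big]_{-1}^{1} = \frac{4}{3}.$$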


Clearly, the following version of the Fubini theorem holds, which is symmetric 
with respect to the preceding one. 


Theorem 11.35 (Reduction Theorem II) Let $f : I \to \mathbb{R}$ be an integrable function on the rectangle $I = [a_1, b_1] \times [a_2, b_2]$. Then:

(a) For almost every $y \in [a_2, b_2]$ the function $f(\cdot, y)$ is integrable on $[a_1, b_1]$.

(b) The function $\int_{a_1}^{b_1} f(x, \cdot)\,dx$, defined almost everywhere on $[a_2, b_2]$, is integrable there.

(c) We have

$$\int_I f = \int_{a_2}^{b_2} \Big( \int_{a_1}^{b_1} f(x,y)\,dx \Big)\,dy.$$

As an immediate consequence, we have that, if $f$ is integrable on $I = [a_1, b_1] \times [a_2, b_2]$, then

$$\int_{a_1}^{b_1} \Big( \int_{a_2}^{b_2} f(x,y)\,dy \Big)\,dx = \int_{a_2}^{b_2} \Big( \int_{a_1}^{b_1} f(x,y)\,dx \Big)\,dy.$$

Therefore, if the preceding equality does not hold, then the function $f$ is not integrable on $I$.

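Examples Consider the function

$$f(x,y) = \begin{cases} \dfrac{x^2 - y^2}{(x^2 + y^2)^2} & \text{if } (x,y) \ne (0,0), \\[2mm] 0 & \text{if } (x,y) = (0,0), \end{cases}$$

on the rectangle $I = [0,1] \times [0,1]$. If $x \ne 0$, then we have

$$\int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dy = \Big[ \frac{y}{x^2 + y^2} \Big]_{y=0}^{y=1} = \frac{1}{x^2 + 1},$$

so that

$$\int_0^1 \Big( \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dy \Big)\,dx = \int_0^1 \frac{1}{x^2 + 1}\,dx = [\arctan x]_0^1 = \frac{\pi}{4}.$$

Analogously, we see that

$$\int_0^1 \Big( \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dx \Big)\,dy = -\frac{\pi}{4},$$

and we thus conclude that $f$ is not integrable on $I$.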
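
The asymmetry between the two iterated integrals can be observed numerically. The sketch below (helper names and sample points are illustrative assumptions) checks the inner integrals against the closed forms $1/(x^2+1)$ and $-1/(y^2+1)$ at a few points away from the origin, and then integrates those closed forms in the remaining variable.

```python
import math

def midpoint_sum(g, a, b, n=20_000):
    """Midpoint Riemann sum of g on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x * x - y * y) / (x * x + y * y) ** 2

def inner_dy(x):
    """Integral of f(x, .) over [0, 1], for a fixed x different from 0."""
    return midpoint_sum(lambda y: f(x, y), 0.0, 1.0)

def inner_dx(y):
    """Integral of f(., y) over [0, 1], for a fixed y different from 0."""
    return midpoint_sum(lambda x: f(x, y), 0.0, 1.0)

for s in (0.25, 0.5, 1.0):
    print(s, inner_dy(s), 1.0 / (s * s + 1.0), inner_dx(s), -1.0 / (s * s + 1.0))

# Outer integrals of the closed-form inner integrals.
dy_then_dx = midpoint_sum(lambda x: 1.0 / (x * x + 1.0), 0.0, 1.0)
dx_then_dy = midpoint_sum(lambda y: -1.0 / (y * y + 1.0), 0.0, 1.0)
print(dy_then_dx, dx_then_dy, math.pi / 4)
```

The two outer values have opposite signs, which is exactly the obstruction to integrability pointed out above.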


As a further example, consider the function

$$f(x,y) = \begin{cases} \dfrac{xy}{(x^2 + y^2)^2} & \text{if } (x,y) \ne (0,0), \\[2mm] 0 & \text{if } (x,y) = (0,0), \end{cases}$$

on the rectangle $I = [-1,1] \times [-1,1]$. In this case, if $x \ne 0$, we have

$$\int_{-1}^{1} \frac{xy}{(x^2 + y^2)^2}\,dy = \Big[ \frac{-x}{2(x^2 + y^2)} \Big]_{y=-1}^{y=1} = 0,$$

so that

$$\int_{-1}^{1} \Big( \int_{-1}^{1} \frac{xy}{(x^2 + y^2)^2}\,dy \Big)\,dx = 0.$$

Analogously, we see that

$$\int_{-1}^{1} \Big( \int_{-1}^{1} \frac{xy}{(x^2 + y^2)^2}\,dx \Big)\,dy = 0.$$

Nevertheless, we are not allowed to conclude that $f$ is integrable on $I$. Actually, it is not at all. Indeed, if $f$ were integrable, it should be on every subrectangle, and in particular on $[0,1] \times [0,1]$. But if $x \ne 0$, then we have

$$\int_0^1 \frac{xy}{(x^2 + y^2)^2}\,dy = \Big[ \frac{-x}{2(x^2 + y^2)} \Big]_{y=0}^{y=1} = \frac{1}{2x(x^2 + 1)},$$

which is not integrable with respect to $x$ on $[0,1]$.

When the function $f$ is defined on a bounded subset $E$ of $\mathbb{R}^2$, it is possible to state the reduction theorem for the function $f_E$. Let $I = [a_1, b_1] \times [a_2, b_2]$ be a rectangle containing $E$. Let us define the “sections” of $E$:

$$E_x = \{y \in [a_2, b_2] : (x,y) \in E\}, \qquad E_y = \{x \in [a_1, b_1] : (x,y) \in E\},$$

and the “projections” of $E$:

$$P_1 E = \{x \in [a_1, b_1] : E_x \ne \emptyset\}, \qquad P_2 E = \{y \in [a_2, b_2] : E_y \ne \emptyset\}.$$


We can then reformulate the Fubini theorem in the following way. 


Theorem 11.36 (Reduction Theorem III) Let $f : E \to \mathbb{R}$ be an integrable function on the bounded set $E$. Then:

(a) For almost every $x \in P_1 E$ the function $f(x, \cdot)$ is integrable on the set $E_x$.

(b) The function $x \mapsto \int_{E_x} f(x,y)\,dy$, defined almost everywhere on $P_1 E$, is integrable there.

(c) We have

$$\int_E f = \int_{P_1 E} \Big( \int_{E_x} f(x,y)\,dy \Big)\,dx.$$

Analogously, the function $y \mapsto \int_{E_y} f(x,y)\,dx$, defined almost everywhere on $P_2 E$, is integrable there, and

$$\int_E f = \int_{P_2 E} \Big( \int_{E_y} f(x,y)\,dx \Big)\,dy.$$


Example Consider the function $f(x,y) = |xy|$ on the set

$$E = \{(x,y) \in \mathbb{R}^2 : 0 \le x \le 1,\ -x^2 \le y \le x^2\}.$$

Since $f$ is continuous and $E$ is compact, the theorem applies; we have $P_1 E = [0,1]$ and, for every $x \in P_1 E$, $E_x = [-x^2, x^2]$. Hence:

$$\int_E f = \int_0^1 \Big( \int_{-x^2}^{x^2} |xy|\,dy \Big)\,dx = \int_0^1 |x| \Big[ \frac{y|y|}{2} \Big]_{-x^2}^{x^2}\,dx = \int_0^1 x^5\,dx = \Big[ \frac{x^6}{6} \Big]_0^1 = \frac{1}{6}.$$
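
Integration over sections can be mimicked numerically by letting the inner interval of integration depend on $x$. The following sketch (with hypothetical helper names `midpoint_sum` and `integral_by_sections`, and an arbitrary grid size) does this for $f(x,y) = |xy|$ on the set $E$ of the example.

```python
def midpoint_sum(g, a, b, n=400):
    """Midpoint Riemann sum of g on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def integral_by_sections(f, n=400):
    """Integrate f over E = {0 <= x <= 1, -x^2 <= y <= x^2} section by section."""
    def inner(x):
        # The section E_x is the interval [-x^2, x^2].
        return midpoint_sum(lambda y: f(x, y), -x * x, x * x, n)
    # The projection P_1 E is the interval [0, 1].
    return midpoint_sum(inner, 0.0, 1.0, n)

value = integral_by_sections(lambda x, y: abs(x * y))
print(value)  # should be close to 1/6
```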


As a corollary, we have a method to compute the measure of a bounded 
measurable set. 


Corollary 11.37 If $E \subseteq \mathbb{R}^2$ is a measurable bounded set, then:

(a) For almost every $x \in P_1 E$ the set $E_x$ is measurable.
(b) The function $x \mapsto \mu(E_x)$, defined almost everywhere on $P_1 E$, is integrable there.
(c) We have

$$\mu(E) = \int_{P_1 E} \mu(E_x)\,dx.$$

Analogously, the function $y \mapsto \mu(E_y)$, defined almost everywhere on $P_2 E$, is integrable there, and

$$\mu(E) = \int_{P_2 E} \mu(E_y)\,dy.$$


Example Let us compute the area of a disk with radius $R > 0$: Let $E = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 \le R^2\}$. Since $E$ is a compact set, it is measurable. We have that $P_1 E = [-R, R]$ and, for every $x \in P_1 E$, $E_x = [-\sqrt{R^2 - x^2}, \sqrt{R^2 - x^2}]$. Hence:

$$\mu(E) = \int_{-R}^{R} 2\sqrt{R^2 - x^2}\,dx = \int_{-\pi/2}^{\pi/2} 2R^2 \cos^2 t\,dt = R^2 [t + \cos t \sin t]_{-\pi/2}^{\pi/2} = \pi R^2.$$

In the case of functions of more than two variables, results analogous to the 
preceding ones hold true, with the same proofs. One simply needs to separate the 
variables into two different groups, calling x the first group and y the second one, 
and the same formulas hold. 


Example We want to compute the volume of a three-dimensional ball with radius $R > 0$. Let $E = \{(x,y,z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 \le R^2\}$. Let us group together the variables $(y,z)$ and consider the projection on the $x$-axis: $P_1 E = [-R, R]$. The sections $E_x$ then are disks of radius $\sqrt{R^2 - x^2}$, and we have

$$\mu(E) = \int_{-R}^{R} \pi (R^2 - x^2)\,dx = \pi (2R^3) - \pi \Big[ \frac{x^3}{3} \Big]_{-R}^{R} = \frac{4}{3} \pi R^3.$$

Another way to compute the same volume is to group the variables $(x,y)$ and consider $P_1 E = \{(x,y) : x^2 + y^2 \le R^2\}$. For every $(x,y) \in P_1 E$ we have

$$E_{(x,y)} = \Big[ -\sqrt{R^2 - x^2 - y^2},\ \sqrt{R^2 - x^2 - y^2} \Big],$$

so that

$$\begin{aligned}
\mu(E) &= \int_{P_1 E} 2\sqrt{R^2 - x^2 - y^2}\,dx\,dy \\
&= \int_{-R}^{R} \Big( \int_{-\sqrt{R^2 - x^2}}^{\sqrt{R^2 - x^2}} 2\sqrt{R^2 - x^2 - y^2}\,dy \Big)\,dx \\
&= \int_{-R}^{R} \Big( \int_{-\pi/2}^{\pi/2} 2(R^2 - x^2) \cos^2 t\,dt \Big)\,dx = \int_{-R}^{R} \pi (R^2 - x^2)\,dx = \frac{4}{3} \pi R^3,
\end{aligned}$$

by the change of variable $t = \arcsin\big(y / \sqrt{R^2 - x^2}\big)$.

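Iterating the preceding reduction procedure, it is possible to prove, for a function of $N$ variables that is integrable on a rectangle

$$I = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_N, b_N],$$

formulas like

$$\int_I f = \int_{a_1}^{b_1} \Big( \int_{a_2}^{b_2} \Big( \cdots \int_{a_N}^{b_N} f(x_1, x_2, \dots, x_N)\,dx_N \cdots \Big)\,dx_2 \Big)\,dx_1.$$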


11.9 Change of Variables in the Integral 


In this section we look for an analogue to the formula of integration by substitution, 
which was proved in Chap. 7 for functions of a single variable. The proof of that 
formula was based on the Fundamental Theorem. Since we do not have such a 
powerful tool for functions of several variables, actually we will not be able to 
completely generalize that formula. 




For example, not only will the function $\varphi$ be assumed to be differentiable, but we will need it to be a diffeomorphism between two open sets $A$ and $B$ of $\mathbb{R}^N$. In other words, $\varphi : A \to B$ will be continuously differentiable and invertible, and $\varphi^{-1} : B \to A$ will be continuously differentiable as well. It is useful to recall that, by Theorem 2.11, a diffeomorphism transforms open sets into open sets and closed sets into closed sets. Moreover, by Theorem 10.26, for every point $u \in A$ the Jacobian matrix $J\varphi(u)$ is invertible: We have

$$\det J\varphi(u) \ne 0.$$

From now on, we will often use a different notation for the Jacobian matrix: instead of $J\varphi(u)$, we will write $\varphi'(u)$.


We will also need the following property. 


Lemma 11.38 Let $A \subseteq \mathbb{R}^N$ be an open set and $\varphi : A \to \mathbb{R}^N$ a $C^1$-function; if $S$ is a subset of $A$ of the type

$$S = [a_1, b_1] \times \cdots \times [a_{N-1}, b_{N-1}] \times \{c\},$$

then $\varphi(S)$ is negligible.

Proof For simplicity, let us concentrate on the case of a subset of $\mathbb{R}^2$ of the type

$$S = [0,1] \times \{0\}.$$

For any positive integer $n$, consider the rectangles (actually squares)

$$J_{k,n} = \Big[ \frac{k-1}{n}, \frac{k}{n} \Big] \times \Big[ -\frac{1}{2n}, \frac{1}{2n} \Big],$$

with $k = 1, \dots, n$. For $n$ large enough, they are contained in a rectangle $R$, which itself is contained in $A$. Since $R$ is a compact set, there is a constant $C > 0$ such that $\|\varphi'(u)\| \le C$ for every $u \in R$. By the Mean Value Theorem 10.20, $\varphi$ is “Lipschitz continuous” on $R$ with Lipschitz constant $C$, i.e.,

$$\|\varphi(u) - \varphi(v)\| \le C \|u - v\|, \quad \text{for every } u, v \in R.$$

Since the sets $J_{k,n}$ have as diameter $\frac{1}{n}\sqrt{2}$, the sets $\varphi(J_{k,n})$ are surely contained in some squares $\tilde{J}_{k,n}$ whose sides' lengths are equal to $\frac{2C}{n}\sqrt{2}$. We then have that $\varphi(S)$ is covered by the rectangles $\tilde{J}_{k,n}$ and

$$\sum_{k=1}^n \mu(\tilde{J}_{k,n}) \le n \Big( \frac{2C}{n}\sqrt{2} \Big)^2 = \frac{8C^2}{n}.$$

Since this quantity can be made arbitrarily small, the conclusion follows from Proposition 11.23. □


As a consequence of the foregoing lemma, it is easy to see that the image of the boundary of a rectangle through a diffeomorphism $\varphi$ is a negligible set. In particular, given two nonoverlapping rectangles, their images are nonoverlapping sets.

We are now ready to prove a first version of the change of variables formula in 
the integral, which will be generalized in a later section. 


Theorem 11.39 (Change of Variables Theorem—I) Let A and B be open subsets 
of $\mathbb{R}^N$ and $\varphi : A \to B$ a diffeomorphism. If $f : B \to \mathbb{R}$ is a continuous function,
then, for every compact subset $D$ of $A$,
\[ \int_{\varphi(D)} f(x)\,dx = \int_D f(\varphi(u))\,|\det \varphi'(u)|\,du . \]


Proof Note first of all that the integrals in the formula are both meaningful, since
the sets $D$ and $\varphi(D)$ are compact and the considered functions continuous. We will
proceed by induction on the dimension $N$. Let us first consider the case $N = 1$.

First, using the method of integration by substitution, one verifies that the
formula is true when $D$ is a compact interval $[a, b]$: It is sufficient to consider
the two possible cases in which $\varphi$ is increasing or decreasing and recall that every
continuous function is primitivable. For instance, if $\varphi$ is decreasing, then we have
$\varphi([a, b]) = [\varphi(b), \varphi(a)]$ and $\varphi' < 0$, so that
\[
\begin{aligned}
\int_{\varphi([a,b])} f(x)\,dx &= \int_{\varphi(b)}^{\varphi(a)} f(x)\,dx \\
&= \int_b^a f(\varphi(u))\,\varphi'(u)\,du \\
&= \int_a^b f(\varphi(u))\,|\varphi'(u)|\,du \\
&= \int_{[a,b]} f(\varphi(u))\,|\varphi'(u)|\,du .
\end{aligned}
\]


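Now let $R$ be a compact subset of $A$ whose interior $\mathring{R}$ contains $D$. Since both $f$
and $(f \circ \varphi)|\varphi'|$ are continuous, they are integrable on the compact sets $\varphi(R)$ and $R$,
respectively. The open sets $\mathring{R}$ and $\mathring{R} \setminus D$ can each be split into a countable union of
nonoverlapping compact intervals whose images through $\varphi$ also are nonoverlapping
closed intervals. By the complete additivity of the integral, the formula holds true for
$\mathring{R}$ and $\mathring{R} \setminus D$:
\[
\int_{\varphi(\mathring{R})} f(x)\,dx = \int_{\mathring{R}} f(\varphi(u))\,|\varphi'(u)|\,du ,
\qquad
\int_{\varphi(\mathring{R} \setminus D)} f(x)\,dx = \int_{\mathring{R} \setminus D} f(\varphi(u))\,|\varphi'(u)|\,du .
\]
Hence,
\[
\begin{aligned}
\int_{\varphi(D)} f(x)\,dx &= \int_{\varphi(\mathring{R} \setminus (\mathring{R} \setminus D))} f(x)\,dx \\
&= \int_{\varphi(\mathring{R})} f(x)\,dx - \int_{\varphi(\mathring{R} \setminus D)} f(x)\,dx \\
&= \int_{\mathring{R}} f(\varphi(u))\,|\varphi'(u)|\,du - \int_{\mathring{R} \setminus D} f(\varphi(u))\,|\varphi'(u)|\,du \\
&= \int_{D} f(\varphi(u))\,|\varphi'(u)|\,du ,
\end{aligned}
\]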


so that the formula is proved in the case N = 1. 

Assume now that the formula holds for the dimension $N$, and let us prove that
it also holds for $N + 1$.¹ Once we fix a point $\bar u \in A$, at least one of the partial
derivatives $\frac{\partial \varphi_i}{\partial u_j}(\bar u)$ is different from zero. We can assume without loss of generality
that it is $\frac{\partial \varphi_{N+1}}{\partial u_{N+1}}(\bar u) \neq 0$. Consider the function
\[ \alpha(u_1, \dots, u_{N+1}) = (u_1, \dots, u_N, \varphi_{N+1}(u_1, \dots, u_{N+1})) . \]
Since $\det \alpha'(\bar u) = \frac{\partial \varphi_{N+1}}{\partial u_{N+1}}(\bar u) \neq 0$, by Theorem 10.24 we have that $\alpha$ is a
diffeomorphism between an open neighborhood $U$ of $\bar u$ and an open neighborhood
$V$ of $\alpha(\bar u)$. Assume first that $D$ is contained in $U$, and set $\tilde D = \alpha(D)$.
We define on $V$ the function $\beta = \varphi \circ \alpha^{-1}$, which is of the form
\[ \beta(v_1, \dots, v_{N+1}) = (\beta_1(v_1, \dots, v_{N+1}), \dots, \beta_N(v_1, \dots, v_{N+1}), v_{N+1}) , \]
where, for $j = 1, \dots, N$, we have
\[ \beta_j(v_1, \dots, v_{N+1}) = \varphi_j\big(v_1, \dots, v_N, [\varphi_{N+1}(v_1, \dots, v_N, \cdot)]^{-1}(v_{N+1})\big) . \]
Such a function $\beta$ is a diffeomorphism between the open sets $V$ and $W = \varphi(U)$.


¹ At a first reading, it is advisable to consider the transition from $N = 1$ to $N + 1 = 2$.




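Consider the sections
\[ V_t = \{(v_1, \dots, v_N) : (v_1, \dots, v_N, t) \in V\} \]
and the projection
\[ P_{N+1}V = \{t : V_t \neq \emptyset\} . \]
For $t \in P_{N+1}V$, define the function
\[ \beta_t(v_1, \dots, v_N) = (\beta_1(v_1, \dots, v_N, t), \dots, \beta_N(v_1, \dots, v_N, t)) , \]
which happens to be a diffeomorphism defined on the open set $V_t$ whose image is
the open set
\[ W_t = \{(x_1, \dots, x_N) : (x_1, \dots, x_N, t) \in W\} . \]
Moreover, $\det \beta_t'(v_1, \dots, v_N) = \det \beta'(v_1, \dots, v_N, t)$. Consider also the sections
\[ \tilde D_t = \{(v_1, \dots, v_N) : (v_1, \dots, v_N, t) \in \tilde D\} \]
and the projection
\[ P_{N+1}\tilde D = \{t : \tilde D_t \neq \emptyset\} . \]
Analogously, we consider $\beta(\tilde D)_t$ and $P_{N+1}\beta(\tilde D)$. By the definition of $\beta$, we have
\[ \beta(\tilde D)_t = \beta_t(\tilde D_t) , \qquad P_{N+1}\beta(\tilde D) = P_{N+1}\tilde D . \]
Using the Reduction Theorem 11.34 and the inductive assumption, we have
\[
\begin{aligned}
\int_{\beta(\tilde D)} f
&= \int_{P_{N+1}\beta(\tilde D)} \left( \int_{\beta_t(\tilde D_t)} f(x_1, \dots, x_N, t)\,dx_1 \dots dx_N \right) dt \\
&= \int_{P_{N+1}\tilde D} \left( \int_{\tilde D_t} f(\beta_t(v_1, \dots, v_N), t)\,|\det \beta_t'(v_1, \dots, v_N)|\,dv_1 \dots dv_N \right) dt \\
&= \int_{P_{N+1}\tilde D} \left( \int_{\tilde D_t} f(\beta(v_1, \dots, v_N, t))\,|\det \beta'(v_1, \dots, v_N, t)|\,dv_1 \dots dv_N \right) dt \\
&= \int_{\tilde D} f(\beta(v))\,|\det \beta'(v)|\,dv .
\end{aligned}
\]
Consider now the function $\tilde f : V \to \mathbb{R}$ defined as
\[ \tilde f(v) = f(\beta(v))\,|\det \beta'(v)| . \]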


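Define the sections
\[ D_{u_1, \dots, u_N} = \{u_{N+1} : (u_1, \dots, u_N, u_{N+1}) \in D\} \]
and the projection
\[ P_{1, \dots, N}D = \{(u_1, \dots, u_N) : D_{u_1, \dots, u_N} \neq \emptyset\} . \]
In an analogous way we define $\alpha(D)_{u_1, \dots, u_N}$ and $P_{1, \dots, N}\alpha(D)$. They are all closed
sets, and by the definition of $\alpha$, we have
\[ \alpha(D)_{u_1, \dots, u_N} = \varphi_{N+1}(u_1, \dots, u_N, D_{u_1, \dots, u_N}) , \qquad P_{1, \dots, N}\alpha(D) = P_{1, \dots, N}D . \]
Moreover, for every $(u_1, \dots, u_N) \in P_{1, \dots, N}D$, the function defined by
\[ t \mapsto \varphi_{N+1}(u_1, \dots, u_N, t) \]
is a diffeomorphism of one variable between the open sets $U_{u_1, \dots, u_N}$ and $V_{u_1, \dots, u_N}$,
sections of $U$ and $V$, respectively. Using the Reduction Theorem 11.34 and the one-
dimensional change of variables formula proved earlier, we have that
\[
\begin{aligned}
\int_{\alpha(D)} \tilde f(v)\,dv
&= \int_{P_{1, \dots, N}\alpha(D)} \left( \int_{\alpha(D)_{v_1, \dots, v_N}} \tilde f(v_1, \dots, v_{N+1})\,dv_{N+1} \right) dv_1 \dots dv_N \\
&= \int_{P_{1, \dots, N}D} \left( \int_{D_{u_1, \dots, u_N}} \tilde f\big(u_1, \dots, u_N, \varphi_{N+1}(u_1, \dots, u_{N+1})\big) \left| \frac{\partial \varphi_{N+1}}{\partial u_{N+1}}(u_1, \dots, u_{N+1}) \right| du_{N+1} \right) du_1 \dots du_N \\
&= \int_D \tilde f(\alpha(u))\,|\det \alpha'(u)|\,du .
\end{aligned}
\]
Hence, since $\varphi = \beta \circ \alpha$, we have
\[
\begin{aligned}
\int_{\varphi(D)} f(x)\,dx &= \int_{\beta(\tilde D)} f(x)\,dx \\
&= \int_{\tilde D} f(\beta(v))\,|\det \beta'(v)|\,dv \\
&= \int_{\alpha(D)} \tilde f(v)\,dv \\
&= \int_D \tilde f(\alpha(u))\,|\det \alpha'(u)|\,du \\
&= \int_D f(\beta(\alpha(u)))\,|\det \beta'(\alpha(u))|\,|\det \alpha'(u)|\,du \\
&= \int_D f(\varphi(u))\,|\det \varphi'(u)|\,du .
\end{aligned}
\]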


We have then proved that, for every $u \in A$, there is a $\delta(u) > 0$ such that the thesis
holds true when $D$ is contained in $B[u, \delta(u)]$. A gauge $\delta$ is thus defined on $A$. By
Lemma 11.20, we can now cover $A$ with a countable family $(J_k)_k$ of nonoverlapping
rectangles, each contained in a rectangle of the type $B[u, \delta(u)]$, so that the formula
holds for the closed sets contained in any of these rectangles.

At this point let us consider an arbitrary compact subset $D$ of $A$. Then the formula
holds for each $D \cap J_k$, and, by the complete additivity of the integral and the fact
that the sets $\varphi(D \cap J_k)$ are nonoverlapping (as a consequence of Lemma 11.38), we
have
\[
\begin{aligned}
\int_{\varphi(D)} f(x)\,dx &= \sum_k \int_{\varphi(D \cap J_k)} f(x)\,dx \\
&= \sum_k \int_{D \cap J_k} f(\varphi(u))\,|\det \varphi'(u)|\,du \\
&= \int_D f(\varphi(u))\,|\det \varphi'(u)|\,du .
\end{aligned}
\]
The theorem is thus completely proved. □


Remark 11.40 The change of variables formula is often written, setting $\varphi(D) = E$,
in the equivalent form
\[ \int_E f(x)\,dx = \int_{\varphi^{-1}(E)} f(\varphi(u))\,|\det \varphi'(u)|\,du . \]


Example Consider the set
\[ E = \{(x, y) \in \mathbb{R}^2 : -1 \le x \le 1,\ x^2 \le y \le x^2 + 1\} , \]
and let $f(x, y) = x^2 y$ be a function on it. Defining $\varphi(u, v) = (u, v + u^2)$, we have
a diffeomorphism with $\det \varphi'(u, v) = 1$. Since $\varphi^{-1}(E) = [-1, 1] \times [0, 1]$, by the
change of variables formula and the use of the Fubini reduction theorem we have
\[ \int_E x^2 y\,dx\,dy = \int_{-1}^{1} \left( \int_0^1 u^2 (v + u^2)\,dv \right) du = \int_{-1}^{1} \left( \frac{u^2}{2} + u^4 \right) du = \frac{11}{15} . \]
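The value 11/15 in this example can be checked numerically. The following is only a rough sketch, not part of the original text: the helper `midpoint` (a composite midpoint rule) and the names `lhs`, `rhs` are introduced here for illustration. It evaluates both sides of the change of variables formula as iterated integrals.

```python
def midpoint(g, a, b, n=200):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def f(x, y):
    return x * x * y


def phi(u, v):
    # The diffeomorphism of the example; its Jacobian determinant is 1.
    return (u, v + u * u)


# Left-hand side: integrate f over E, with y ranging over [x^2, x^2 + 1].
lhs = midpoint(lambda x: midpoint(lambda y: f(x, y), x * x, x * x + 1.0), -1.0, 1.0)

# Right-hand side: integrate f(phi(u, v)) * |det phi'(u, v)| over [-1, 1] x [0, 1].
rhs = midpoint(lambda u: midpoint(lambda v: f(*phi(u, v)) * 1.0, 0.0, 1.0), -1.0, 1.0)

exact = 11 / 15
print(lhs, rhs, exact)
```

Both approximations should agree with 11/15 up to the discretization error of the midpoint rule.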




11.10 Change of Measure by Diffeomorphisms 


In this section we study how a measure is changed by the action of a diffeomor- 
phism. 


Theorem 11.41 Let $A$ and $B$ be open subsets of $\mathbb{R}^N$, and let $\varphi : A \to B$ be a
diffeomorphism. Let $D \subseteq A$ and $\varphi(D) \subseteq B$ be bounded sets. If $D$ is measurable,
then $\varphi(D)$ is measurable, $|\det \varphi'|$ is integrable on $D$, and
\[ \mu(\varphi(D)) = \int_D |\det \varphi'(u)|\,du . \]


Proof By the preceding theorem, the formula holds true whenever D is compact. 
Since every open set can be written as the union of a countable family of 
nonoverlapping (closed) rectangles, by the complete additivity and the fact that A is 
bounded, the formula holds true even if D is an open bounded set. 

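Assume now that $D$ is a measurable bounded set whose closure $\overline{D}$ is contained
in $A$. Let $R$ be a compact subset of $A$ whose interior $\mathring{R}$ contains $\overline{D}$. Then there is a
constant $C > 0$ such that $|\det \varphi'(u)| \le C$ for every $u \in R$. By Proposition 11.22,
for every $\varepsilon > 0$ there are two finite or countable families $(J_k)$ and $(J_k')$, each made
of nonoverlapping rectangles contained in $\mathring{R}$ such that
\[
\mathring{R} \setminus \Big( \bigcup_k J_k' \Big) \subseteq D \subseteq \bigcup_k J_k ,
\qquad
\mu\Big( \Big( \bigcup_k J_k \Big) \cap \Big( \bigcup_k J_k' \Big) \Big) \le \varepsilon .
\]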


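Since the formula to be proved holds on both the open bounded sets and the compact
sets, it certainly holds on each rectangle $J_k$ and $J_k'$; then it holds on $\bigcup_k J_k$ and on
$\bigcup_k J_k'$, and since it holds even on $\mathring{R}$, it must be true on $\mathring{R} \setminus (\bigcup_k J_k')$ as well. Thus, we
have that $\varphi(\bigcup_k J_k)$ and $\varphi(\mathring{R} \setminus (\bigcup_k J_k'))$ are measurable,
\[ \varphi\Big( \mathring{R} \setminus \Big( \bigcup_k J_k' \Big) \Big) \subseteq \varphi(D) \subseteq \varphi\Big( \bigcup_k J_k \Big) , \]
and
\[
\begin{aligned}
\mu\Big( \varphi\Big( \bigcup_k J_k \Big) \Big) - \mu\Big( \varphi\Big( \mathring{R} \setminus \Big( \bigcup_k J_k' \Big) \Big) \Big)
&= \int_{\bigcup_k J_k} |\det \varphi'(u)|\,du - \int_{\mathring{R} \setminus (\bigcup_k J_k')} |\det \varphi'(u)|\,du \\
&= \int_{(\bigcup_k J_k) \cap (\bigcup_k J_k')} |\det \varphi'(u)|\,du \\
&\le C \mu\Big( \Big( \bigcup_k J_k \Big) \cap \Big( \bigcup_k J_k' \Big) \Big) \\
&\le C \varepsilon .
\end{aligned}
\]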


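Taking $\varepsilon = \frac{1}{n}$, we find in this way two sequences $D_n = \bigcup_k J_{k,n}$ and $D_n' =
\mathring{R} \setminus (\bigcup_k J_{k,n}')$ with the aforementioned properties. By Proposition 11.22, we have that
$\varphi(D)$ is measurable and $\mu(\varphi(D)) = \lim_n \mu(\varphi(D_n)) = \lim_n \mu(\varphi(D_n'))$. Moreover,
since $\chi_{D_n}$ converges almost everywhere to $\chi_D$, by the Dominated Convergence
Theorem 9.13, we have that
\[
\begin{aligned}
\mu(\varphi(D)) &= \lim_n \mu(\varphi(D_n)) \\
&= \lim_n \int_{D_n} |\det \varphi'(u)|\,du \\
&= \lim_n \int_R |\det \varphi'(u)|\,\chi_{D_n}(u)\,du \\
&= \int_R |\det \varphi'(u)|\,\chi_D(u)\,du \\
&= \int_D |\det \varphi'(u)|\,du .
\end{aligned}
\]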


We can now consider the case of an arbitrary measurable bounded set $D$ in $A$.
Since $D$ is bounded, there is an open ball $B(0, \rho)$ containing it. Let $A' = A \cap B(0, \rho)$
and $B' = \varphi(A')$. Since $A'$ is open and bounded, as in the proof of Proposition 11.11,
we can consider a sequence of nonoverlapping rectangles $(K_n)_n$ whose union is
equal to $A'$. The formula holds for each of the sets $D \cap K_n$ by the foregoing
considerations. The complete additivity of the integral (Theorem 11.28) and the
fact that $A'$ is bounded then lead us to our conclusion. □


Example Consider the set
\[ E = \{(x, y) \in \mathbb{R}^2 : x < y < 2x,\ 3x^2 < y < 4x^2\} . \]
We see that $E$ is measurable since it is an open set. Taking
\[ \varphi(u, v) = \left( \frac{u}{v}, \frac{u^2}{v} \right) , \]
we have a diffeomorphism between the set $D = \,]1, 2[\, \times \,]3, 4[$ and $E = \varphi(D)$.
Moreover,
\[
\det \varphi'(u, v) = \det \begin{pmatrix} 1/v & -u/v^2 \\ 2u/v & -u^2/v^2 \end{pmatrix} = \frac{u^2}{v^3} .
\]
Applying the formula on the change of measure and the Reduction Theorem 11.34,
we have that
\[ \mu(E) = \int_1^2 \left( \int_3^4 \frac{u^2}{v^3}\,dv \right) du = \int_1^2 \frac{7}{288}\,u^2\,du = \frac{49}{864} . \]
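As a sanity check of the value 49/864, one can compare the integral of the Jacobian determinant over $D$ with a direct computation of the area of $E$ in the $(x, y)$ plane. The sketch below is an illustration only; the helper names `midpoint` and `section_length` are not from the text. For fixed $x$, the section of $E$ is the interval between $\max(x, 3x^2)$ and $\min(2x, 4x^2)$.

```python
def midpoint(g, a, b, n=400):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# Integral of |det phi'(u, v)| = u^2 / v^3 over D = ]1, 2[ x ]3, 4[.
jac = midpoint(lambda u: midpoint(lambda v: u * u / v**3, 3.0, 4.0), 1.0, 2.0)


def section_length(x):
    # Length of the section {y : x < y < 2x, 3x^2 < y < 4x^2}; zero when empty.
    return max(0.0, min(2.0 * x, 4.0 * x * x) - max(x, 3.0 * x * x))


# Area of E computed directly, integrating the section lengths over x in [0, 1].
area = midpoint(section_length, 0.0, 1.0, n=20000)

exact = 49 / 864
print(jac, area, exact)
```

The two numbers are obtained by independent routes, so their agreement gives some confidence in both the diffeomorphism and the determinant.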


11.11 The General Theorem on Change of Variables 


We are now interested in generalizing the Change of Variables Theorem 11.39 
assuming f is not necessarily continuous but only L-integrable on a measurable set. 
To do this, it will be useful to prove the following important relationship between the 
integral of a function having nonnegative values and the measure of its hypograph. 


Proposition 11.42 Let $E$ be a measurable bounded set and $f : E \to \mathbb{R}$ a bounded
function with nonnegative values. Let $G_f$ be the set thus defined:
\[ G_f = \{(x, t) \in E \times \mathbb{R} : 0 \le t \le f(x)\} . \]
Then $f$ is integrable on $E$ if and only if $G_f$ is measurable, in which case
\[ \mu(G_f) = \int_E f . \]

Proof Assume first that $G_f$ is measurable. By Fubini's Reduction Theorem 11.36,
since $P_1 G_f = E$, the sections being $(G_f)_x = [0, f(x)]$, we have that the function
$x \mapsto \int_{(G_f)_x} 1 = f(x)$ is integrable on $E$ and
\[ \mu(G_f) = \int_{G_f} 1 = \int_E \left( \int_0^{f(x)} 1\,dt \right) dx = \int_E f(x)\,dx . \]


Assume now that $f$ is integrable on $E$. Let $C > 0$ be a constant such that $0 \le
f(x) < C$ for every $x \in E$. Given a positive integer $n$, we divide the interval
$[0, C]$ into $n$ equal parts and consider, for $j = 1, \dots, n$, the sets
\[ E_n^j = \left\{ x \in E : \frac{j-1}{n}\,C \le f(x) < \frac{j}{n}\,C \right\} ; \]
as a consequence of Lemma 11.16, they are measurable and nonoverlapping, and
their union is $E$. We can then define on $E$ the function $\psi_n$ in the following way:
\[ \psi_n = \sum_{j=1}^{n} \frac{j}{n}\,C\,\chi_{E_n^j} , \]
and so
\[ G_{\psi_n} = \bigcup_{j=1}^{n} \left( E_n^j \times \left[ 0, \frac{j}{n}\,C \right] \right) . \]
By Proposition 11.22, it is easy to see that, since the sets $E_n^j$ are measurable, the
sets $E_n^j \times [0, \frac{j}{n}C]$ are, too. Consequently, the sets $G_{\psi_n}$ are measurable. Moreover,
since
\[ G_f = \bigcap_{n \ge 1} G_{\psi_n} , \]
even $G_f$ is measurable, and the proof is thus completed. □


We are now in a position to prove the second version of the theorem on the change 
of variables in the integral. 


Theorem 11.43 (Change of Variables Theorem—II) Let A and B be open 
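subsets of $\mathbb{R}^N$, and let $\varphi : A \to B$ be a diffeomorphism. Let $D \subseteq A$ and $\varphi(D) \subseteq B$
be measurable bounded sets and $f : \varphi(D) \to \mathbb{R}$ a function. Then $f$ is L-integrable
on $\varphi(D)$ if and only if $(f \circ \varphi)\,|\det \varphi'|$ is L-integrable on $D$, in which case
\[ \int_{\varphi(D)} f(x)\,dx = \int_D f(\varphi(u))\,|\det \varphi'(u)|\,du . \]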
Proof Assume that $f$ is L-integrable on $E = \varphi(D)$. We first consider the case
where $f$ is bounded with nonnegative values.
Let $C > 0$ be such that $0 \le f(x) < C$ for every $x \in E$. We define the open sets
\[ \tilde A = A \times \,]-C, C[ , \qquad \tilde B = B \times \,]-C, C[ \]
and the function $\tilde\varphi : \tilde A \to \tilde B$ in the following way:
\[ \tilde\varphi(u_1, \dots, u_N, t) = (\varphi_1(u_1, \dots, u_N), \dots, \varphi_N(u_1, \dots, u_N), t) . \]
This function is a diffeomorphism, and $\det \tilde\varphi'(u, t) = \det \varphi'(u)$ for every $(u, t) \in
\tilde A$. Let $G_f$ be the hypograph of $f$:
\[ G_f = \{(x, t) \in E \times \mathbb{R} : 0 \le t \le f(x)\} . \]
Since $f$ is L-integrable and $E$ is measurable, by the preceding proposition we have
that $G_f$ is a measurable set. Moreover,
\[ \tilde\varphi^{-1}(G_f) = \{(u, t) \in D \times \mathbb{R} : 0 \le t \le f(\varphi(u))\} . \]
Using Theorem 11.41 and Fubini's reduction theorem, we have
\[
\begin{aligned}
\mu(G_f) &= \int_{\tilde\varphi^{-1}(G_f)} |\det \tilde\varphi'(u, t)|\,du\,dt \\
&= \int_{\tilde\varphi^{-1}(G_f)} |\det \varphi'(u)|\,du\,dt \\
&= \int_D \left( \int_0^{f(\varphi(u))} |\det \varphi'(u)|\,dt \right) du \\
&= \int_D f(\varphi(u))\,|\det \varphi'(u)|\,du .
\end{aligned}
\]
On the other hand, by Proposition 11.42, we have that $\mu(G_f) = \int_{\varphi(D)} f$, and this
proves that the formula holds in the case where $f$ is bounded with nonnegative
values.

In the case where $f$ is not bounded but still has nonnegative values, we consider
the functions
\[ f_n(x) = \min\{f(x), n\} . \]
For each of them, the formula holds true, and using the Monotone Convergence
Theorem 9.10 we prove that the formula holds for $f$ even in this case.
When $f$ does not have nonnegative values, it is sufficient to consider its positive
and negative parts, apply the formula to them, and then subtract.
To obtain the opposite implication, it is sufficient to consider $(f \circ \varphi)\,|\det \varphi'|$
instead of $f$ and $\varphi^{-1}$ instead of $\varphi$ and to apply what was just proved.

We recall here the equivalent formula
\[ \int_E f(x)\,dx = \int_{\varphi^{-1}(E)} f(\varphi(u))\,|\det \varphi'(u)|\,du . \]


11.12 Some Useful Transformations in $\mathbb{R}^2$


Some transformations do not change the measure of any measurable set. We 
consider here some of those that are most frequently used in applications. 


Translations We call translation by a given vector $a = (a_1, a_2) \in \mathbb{R}^2$ the
transformation $\varphi : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ \varphi(u, v) = (u + a_1, v + a_2) . \]




It is readily seen that $\varphi$ is a diffeomorphism, with $\det \varphi' = 1$, so that, given a
measurable bounded set $D$ and an L-integrable function $f$ on $\varphi(D)$, we have
\[ \int_{\varphi(D)} f(x, y)\,dx\,dy = \int_D f(u + a_1, v + a_2)\,du\,dv . \]

Reflections A reflection with respect to one of the cartesian axes is defined by
\[ \varphi(u, v) = (-u, v) , \quad \text{or} \quad \varphi(u, v) = (u, -v) . \]
Here, $\det \varphi' = -1$, hence $|\det \varphi'| = 1$, so that, taking for example the first case, we have
\[ \int_{\varphi(D)} f(x, y)\,dx\,dy = \int_D f(-u, v)\,du\,dv . \]




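Rotations A rotation around the origin by a fixed angle $\alpha$ is given by
\[ \varphi(u, v) = (u \cos\alpha - v \sin\alpha,\ u \sin\alpha + v \cos\alpha) . \]
It is a diffeomorphism, with
\[
\det \varphi'(u, v) = \det \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} = (\cos\alpha)^2 + (\sin\alpha)^2 = 1 .
\]
Hence, given a measurable bounded set $D$ and an L-integrable function² $f$ on $\varphi(D)$,
we have
\[ \int_{\varphi(D)} f(x, y)\,dx\,dy = \int_D f(u \cos\alpha - v \sin\alpha,\ u \sin\alpha + v \cos\alpha)\,du\,dv . \]

Homotheties A homothety of ratio $\alpha > 0$ is a function $\varphi : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ \varphi(u, v) = (\alpha u, \alpha v) . \]
It is a diffeomorphism, with $\det \varphi' = \alpha^2$. Hence,
\[ \int_{\varphi(D)} f(x, y)\,dx\,dy = \alpha^2 \int_D f(\alpha u, \alpha v)\,du\,dv . \]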


² Let us mention here that Buczolich [2] found an ingenious example of an integrable function in
$\mathbb{R}^2$ whose rotation by $\alpha = \pi/4$ is not integrable. This is why we have restricted our attention only
to L-integrable functions.




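Polar Coordinates Another useful transformation is provided by the function $\psi :
[0, +\infty[\, \times [0, 2\pi[\, \to \mathbb{R}^2$ given by
\[ \psi(\rho, \theta) = (\rho \cos\theta, \rho \sin\theta) , \]
which defines the so-called "polar coordinates" in $\mathbb{R}^2$. Consider the open sets
\[ A = \,]0, +\infty[\, \times \,]0, 2\pi[ , \qquad B = \mathbb{R}^2 \setminus ([0, +\infty[\, \times \{0\}) . \]
The function $\varphi : A \to B$ defined by $\varphi(\rho, \theta) = \psi(\rho, \theta)$ happens to be a
diffeomorphism, and it is easily seen that
\[ \det \varphi'(\rho, \theta) = \rho , \quad \text{for every } (\rho, \theta) \in A . \]
Let $E \subseteq \mathbb{R}^2$ be a measurable bounded set, and consider a function $f : E \to \mathbb{R}$. We
can apply the Change of Variables Theorem 11.43 to the set $\tilde E = E \cap B$. Since $\tilde E$
and $\varphi^{-1}(\tilde E)$ differ from $E$ and $\psi^{-1}(E)$, respectively, by negligible sets, we obtain
the following formula on the change of variables in polar coordinates:
\[ \int_E f(x, y)\,dx\,dy = \int_{\psi^{-1}(E)} f(\psi(\rho, \theta))\,\rho\,d\rho\,d\theta . \]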
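Example Let $f(x, y) = xy$ be defined on
\[ E = \{(x, y) \in \mathbb{R}^2 : x \ge 0,\ y \ge 0,\ x^2 + y^2 < 9\} . \]
By the formula on the change of variables in polar coordinates, we have $\psi^{-1}(E) =
[0, 3[\, \times [0, \frac{\pi}{2}]$; by the Reduction Theorem 11.36, we can then compute
\[ \int_E f = \int_0^{\pi/2} \left( \int_0^3 \rho^3 \cos\theta \sin\theta\,d\rho \right) d\theta = \frac{81}{4} \int_0^{\pi/2} \cos\theta \sin\theta\,d\theta = \frac{81}{8} . \]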
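The result 81/8 can be cross-checked by evaluating the integral both in polar and in Cartesian coordinates. This is a minimal numerical sketch under the assumption that a composite midpoint rule (the illustrative helper `midpoint` below) is accurate enough for these smooth integrands; it is not part of the original text.

```python
import math


def midpoint(g, a, b, n=400):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# Polar side: f(psi(rho, theta)) * rho = rho^3 cos(theta) sin(theta)
# on [0, 3[ x [0, pi/2].
polar = midpoint(
    lambda theta: midpoint(
        lambda rho: rho**3 * math.cos(theta) * math.sin(theta), 0.0, 3.0
    ),
    0.0,
    math.pi / 2,
)

# Cartesian side: for fixed x in [0, 3[, y ranges over [0, sqrt(9 - x^2)[.
cartesian = midpoint(
    lambda x: midpoint(lambda y: x * y, 0.0, math.sqrt(9.0 - x * x)),
    0.0,
    3.0,
)

exact = 81 / 8
print(polar, cartesian, exact)
```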




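11.13 Cylindrical and Spherical Coordinates in $\mathbb{R}^3$

We consider the function $\xi : [0, +\infty[\, \times [0, 2\pi[\, \times \mathbb{R} \to \mathbb{R}^3$ defined by
\[ \xi(\rho, \theta, z) = (\rho \cos\theta, \rho \sin\theta, z) , \]
which gives us the so-called "cylindrical coordinates" in $\mathbb{R}^3$. Consider the open sets
\[
A = \,]0, +\infty[\, \times \,]0, 2\pi[\, \times \mathbb{R} , \qquad
B = \big( \mathbb{R}^2 \setminus ([0, +\infty[\, \times \{0\}) \big) \times \mathbb{R} .
\]
The function $\varphi : A \to B$ defined by $\varphi(\rho, \theta, z) = \xi(\rho, \theta, z)$ happens to be a
diffeomorphism, and it is easily seen that
\[ \det \varphi'(\rho, \theta, z) = \rho , \quad \text{for every } (\rho, \theta, z) \in A . \]
Let $E \subseteq \mathbb{R}^3$ be a measurable bounded set, and consider a function $f : E \to \mathbb{R}$. We
can then apply the Change of Variables Theorem 11.43 to the set $\tilde E = E \cap B$. Since
$\tilde E$ and $\varphi^{-1}(\tilde E)$ differ from $E$ and $\xi^{-1}(E)$, respectively, by negligible sets, we obtain
the following formula on the change of variables in cylindrical coordinates:
\[ \int_E f(x, y, z)\,dx\,dy\,dz = \int_{\xi^{-1}(E)} f(\xi(\rho, \theta, z))\,\rho\,d\rho\,d\theta\,dz . \]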


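Example Let us compute the integral $\int_E f$, where $f(x, y, z) = x^2 + y^2$ and
\[ E = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 \le 1,\ 0 \le z \le x + y + \sqrt{2}\} . \]
Passing to cylindrical coordinates, we notice that
\[ \rho \cos\theta + \rho \sin\theta + \sqrt{2} \ge 0 \]
for every $\theta \in [0, 2\pi[$ and every $\rho \in [0, 1]$. By the Change of Variables
Theorem 11.43, using also Fubini's Reduction Theorem 11.36, we compute
\[
\begin{aligned}
\int_E (x^2 + y^2)\,dx\,dy\,dz &= \int_{\xi^{-1}(E)} \rho^3\,d\rho\,d\theta\,dz \\
&= \int_0^1 \left( \int_0^{2\pi} \left( \int_0^{\rho\cos\theta + \rho\sin\theta + \sqrt{2}} \rho^3\,dz \right) d\theta \right) d\rho \\
&= \int_0^1 \left( \int_0^{2\pi} \rho^3 (\rho\cos\theta + \rho\sin\theta + \sqrt{2})\,d\theta \right) d\rho \\
&= 2\pi \int_0^1 \rho^3 \sqrt{2}\,d\rho = \frac{\pi\sqrt{2}}{2} .
\end{aligned}
\]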
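The last integral evaluates to $2\pi\sqrt{2}/4 = \pi\sqrt{2}/2 \approx 2.2214$. As a hedged numerical illustration (the helpers `midpoint` and `column` are names chosen here, not taken from the text), one can compare the cylindrical computation with a direct Cartesian one over the unit disk.

```python
import math


def midpoint(g, a, b, n=400):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


SQRT2 = math.sqrt(2.0)

# Cylindrical side: after integrating in z, the integrand is
# rho^3 * (rho cos(theta) + rho sin(theta) + sqrt(2)).
cyl = midpoint(
    lambda rho: midpoint(
        lambda theta: rho**3
        * (rho * math.cos(theta) + rho * math.sin(theta) + SQRT2),
        0.0,
        2.0 * math.pi,
    ),
    0.0,
    1.0,
)


def column(x):
    # Cartesian side for fixed x: integrate (x^2 + y^2) * (height x + y + sqrt(2))
    # over the chord -w <= y <= w of the unit disk.
    w = math.sqrt(1.0 - x * x)
    return midpoint(lambda y: (x * x + y * y) * (x + y + SQRT2), -w, w)


cart = midpoint(column, -1.0, 1.0)

exact = math.pi * SQRT2 / 2.0
print(cyl, cart, exact)
```

The Cartesian version converges more slowly because the chord length has a square-root behaviour at $x = \pm 1$, which is one practical reason for preferring cylindrical coordinates here.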


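Now consider the function $\sigma : [0, +\infty[\, \times [0, 2\pi[\, \times [0, \pi] \to \mathbb{R}^3$ defined by
\[ \sigma(\rho, \theta, \phi) = (\rho \sin\phi \cos\theta, \rho \sin\phi \sin\theta, \rho \cos\phi) , \]
which defines the so-called "spherical coordinates" in $\mathbb{R}^3$. Consider the open sets
\[ A = \,]0, +\infty[\, \times \,]0, 2\pi[\, \times \,]0, \pi[ , \qquad B = \mathbb{R}^3 \setminus ([0, +\infty[\, \times \{0\} \times \mathbb{R}) . \]
The function $\varphi : A \to B$ defined by $\varphi(\rho, \theta, \phi) = \sigma(\rho, \theta, \phi)$ happens to be a
diffeomorphism, and it can be easily checked that
\[ \det \varphi'(\rho, \theta, \phi) = -\rho^2 \sin\phi , \quad \text{for every } (\rho, \theta, \phi) \in A . \]
Let $E \subseteq \mathbb{R}^3$ be a measurable bounded set, and consider a function $f : E \to \mathbb{R}$. We
can then apply the Change of Variables Theorem 11.43 to $\tilde E = E \cap B$. Since $\tilde E$ and
$\varphi^{-1}(\tilde E)$ differ from $E$ and $\sigma^{-1}(E)$, respectively, by negligible sets, we obtain the
following formula on the change of variables in spherical coordinates:
\[ \int_E f(x, y, z)\,dx\,dy\,dz = \int_{\sigma^{-1}(E)} f(\sigma(\rho, \theta, \phi))\,\rho^2 \sin\phi\,d\rho\,d\theta\,d\phi . \]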
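Example Let us compute the volume of the set
\[ E = \left\{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 \le 1,\ z \ge \sqrt{x^2 + y^2} \right\} . \]
We have
\[
\begin{aligned}
\mu(E) &= \int_E 1\,dx\,dy\,dz \\
&= \int_{\sigma^{-1}(E)} \rho^2 \sin\phi\,d\rho\,d\theta\,d\phi \\
&= \int_0^1 \left( \int_0^{\pi/4} \left( \int_0^{2\pi} \rho^2 \sin\phi\,d\theta \right) d\phi \right) d\rho \\
&= 2\pi \int_0^1 \left( \int_0^{\pi/4} \rho^2 \sin\phi\,d\phi \right) d\rho \\
&= 2\pi \left( 1 - \frac{\sqrt{2}}{2} \right) \int_0^1 \rho^2\,d\rho = \left( 1 - \frac{\sqrt{2}}{2} \right) \frac{2\pi}{3} .
\end{aligned}
\]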
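This volume can also be obtained by slicing $E$ with horizontal planes: at height $z \in [0, 1]$ the section is a disk of radius $\min\{z, \sqrt{1 - z^2}\}$. The sketch below compares the two routes numerically; it is an illustration only, with helper names (`midpoint`, `slice_area`, `colat`) chosen here rather than taken from the text.

```python
import math


def midpoint(g, a, b, n=400):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# Spherical side: the theta-integral contributes a factor 2*pi, leaving
# rho^2 sin(colat) on [0, 1] x [0, pi/4] (colat is the angle from the z-axis).
spherical = 2.0 * math.pi * midpoint(
    lambda rho: midpoint(lambda colat: rho * rho * math.sin(colat), 0.0, math.pi / 4),
    0.0,
    1.0,
)


def slice_area(z):
    # Area of the horizontal section of E at height z: a disk whose radius is
    # bounded by the cone (r <= z) and by the ball (r <= sqrt(1 - z^2)).
    r = min(z, math.sqrt(1.0 - z * z))
    return math.pi * r * r


slices = midpoint(slice_area, 0.0, 1.0, n=4000)

exact = (2.0 * math.pi / 3.0) * (1.0 - math.sqrt(2.0) / 2.0)
print(spherical, slices, exact)
```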


11.14 The Integral on Unbounded Sets 


When dealing with unbounded domains, there are good reasons to limit our attention 
only to L-integrable functions. This section extends the theory of the integral to this 
context. 

Let $E$ be a subset of $\mathbb{R}^N$, not necessarily bounded, and assume first that $f : E \to
\mathbb{R}$ is a nonnegative function, i.e.,
\[ f(x) \ge 0 , \quad \text{for every } x \in E . \]




As usual, we will use the notation
\[ B[0, r] = [-r, r] \times \cdots \times [-r, r] . \]
If $f$ is integrable on $E \cap B[0, r]$ for every $r > 0$, we define
\[ \int_E f = \lim_{r \to +\infty} \int_{E \cap B[0, r]} f . \]
Notice that this limit always exists since the function $r \mapsto \int_{E \cap B[0, r]} f$ is increasing,
because $f \ge 0$. When this limit happens to be finite (i.e., not equal to $+\infty$), we
will say that $f$ is "integrable" (on $E$).

It can be easily seen that the same result is obtained if, instead of $B[0, r]$, we take
the Euclidean closed balls $\overline{B}(0, r)$. This is due to the fact that, for every $r > 0$,
\[ \overline{B}(0, r) \subseteq B[0, r] , \quad \text{and} \quad B[0, r] \subseteq \overline{B}(0, r\sqrt{N}) . \]
The same observation can be made, of course, for many other families $(S_r)$ of
bounded sets invading $\mathbb{R}^N$, meaning that for every $r > 0$ there exists $r' > 0$ such
that
\[ B[0, r] \subseteq S_{r'} . \]


In the case where the function $f$ also has negative values, we consider both its
positive part $f^+ = \max\{f, 0\}$ and its negative part $f^- = \max\{-f, 0\}$, so that
$f = f^+ - f^-$. Notice that $f^+ \ge 0$ and $f^- \ge 0$. We say that $f$ is L-integrable if
both $f^+$ and $f^-$ are integrable, in which case we define
\[ \int_E f = \int_E f^+ - \int_E f^- . \]
Notice that, in this case, since $|f| = f^+ + f^-$, we have that
\[ \int_E |f| = \int_E f^+ + \int_E f^- . \]
The fact that $|f|$ is integrable justifies the name "L-integrable" for the function $f$.

It is not difficult to prove that the set of L-integrable functions is a real vector
space, and the integral is a linear function on it which preserves the order. Moreover,
we can easily verify that a function $f$ is L-integrable on a set $E$ if and only if the
function $f_E$ is L-integrable on $\mathbb{R}^N$.


Definition 11.44 A set $E \subseteq \mathbb{R}^N$ is said to be "measurable" if $E \cap B[0, r]$ is
measurable for every $r > 0$. In that case, we set
\[ \mu(E) = \lim_{r \to +\infty} \mu(E \cap B[0, r]) . \]
Notice that $\mu(E)$, in some cases, can be $+\infty$. It is finite if and only if the constant
function 1 is L-integrable on $E$, i.e., the characteristic function of $E$ is L-integrable
on $\mathbb{R}^N$. The properties of measurable bounded sets extend easily to unbounded sets.
In particular, all open sets and all closed sets are measurable.

The Monotone Convergence Theorem of Beppo Levi attains the following 
general form. 


Theorem 11.45 (Monotone Convergence Theorem—II) We are given a function 
$f$ and a sequence of functions $f_n$, with $n \in \mathbb{N}$, defined almost everywhere on a
subset $E$ of $\mathbb{R}^N$, with real values, verifying the following conditions:

(a) The sequence $(f_n)_n$ converges pointwise to $f$, almost everywhere on $E$.
(b) The sequence $(f_n)_n$ is monotone.
(c) Each function $f_n$ is L-integrable on $E$.
(d) The real sequence $(\int_E f_n)_n$ has a finite limit.

Then $f$ is L-integrable on $E$, and
\[ \int_E f = \lim_n \int_E f_n . \]


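Proof Assume, for definiteness, that the sequence $(f_n)_n$ is increasing. By consid-
ering the sequence $(f_n - f_0)_n$ instead of $(f_n)_n$, we can assume without loss of
generality that all the functions have almost everywhere nonnegative values. Let
$\mathcal{J} = \lim_n (\int_E f_n)$; for every $r > 0$ we can apply the Monotone Convergence
Theorem 9.10 on the bounded set $E \cap B[0, r]$, so that $f$ is integrable on $E \cap B[0, r]$
and
\[ \int_{E \cap B[0, r]} f = \lim_{n \to \infty} \int_{E \cap B[0, r]} f_n \le \lim_{n \to \infty} \int_E f_n = \mathcal{J} . \]
Let us prove that the limit of $\int_{E \cap B[0, r]} f$ exists, as $r \to +\infty$, and that it is equal
to $\mathcal{J}$. Fix $\varepsilon > 0$; there is an $\bar n \in \mathbb{N}$ such that, for $n \ge \bar n$,
\[ \mathcal{J} - \frac{\varepsilon}{2} \le \int_E f_n \le \mathcal{J} ; \]
since, moreover,
\[ \int_E f_{\bar n} = \lim_{r \to +\infty} \int_{E \cap B[0, r]} f_{\bar n} , \]
there is an $\bar r > 0$ such that, for $r \ge \bar r$,
\[ \mathcal{J} - \varepsilon \le \int_{E \cap B[0, r]} f_{\bar n} \le \mathcal{J} . \]
Then, since the sequence $(f_n)_n$ is increasing, we have that, for every $n \ge \bar n$ and
every $r \ge \bar r$,
\[ \mathcal{J} - \varepsilon \le \int_{E \cap B[0, r]} f_n \le \mathcal{J} . \]
Passing to the limit as $n \to +\infty$, we obtain, for every $r \ge \bar r$,
\[ \mathcal{J} - \varepsilon \le \int_{E \cap B[0, r]} f \le \mathcal{J} . \]
The proof is thus completed. □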


As an immediate consequence, there is an analogous statement for the series of 
functions. 


Corollary 11.46 We are given a function $f$ and a sequence of functions $f_k$, with
$k \in \mathbb{N}$, defined almost everywhere on a subset $E$ of $\mathbb{R}^N$, with real values, verifying
the following conditions:

(a) The series $\sum_k f_k$ converges pointwise to $f$, almost everywhere on $E$.
(b) For every $k \in \mathbb{N}$ and almost every $x \in E$, we have $f_k(x) \ge 0$.
(c) Each function $f_k$ is L-integrable on $E$.
(d) The series $\sum_k (\int_E f_k)$ converges.

Then $f$ is L-integrable on $E$ and
\[ \int_E f = \sum_k \int_E f_k . \]


From the Monotone Convergence Theorem 11.45 we deduce, in complete 
analogy with what we have seen for bounded sets, the Dominated Convergence 
Theorem of Henri Lebesgue. 


Theorem 11.47 (Dominated Convergence Theorem—II) We are given a func- 
tion f and a sequence of functions f,, with n € N, defined almost everywhere on a 
subset E of IRN, with real values, verifying the following conditions: 


(a) The sequence $(f_n)_n$ converges pointwise to $f$, almost everywhere on $E$.
(b) Each function $f_n$ is L-integrable on $E$.
(c) There are two functions $g, h$, defined almost everywhere and L-integrable on $E$,
such that
\[ g(x) \le f_n(x) \le h(x) \]
for every $n \in \mathbb{N}$ and almost every $x \in E$.

Then the sequence $(\int_E f_n)_n$ has a finite limit, $f$ is L-integrable on $E$, and
\[ \int_E f = \lim_n \int_E f_n . \]


As a direct consequence we have the complete additivity property of the integral 
for L-integrable functions. 


Theorem 11.48 Let $(E_k)$ be a finite or countable family of pairwise nonoverlap-
ping measurable subsets of $\mathbb{R}^N$ whose union is a set $E$. Then $f$ is L-integrable on
$E$ if and only if the following two conditions hold:

(a) $f$ is L-integrable on each set $E_k$.
(b) $\sum_k \int_{E_k} |f(x)|\,dx < +\infty$.

In that case, we have
\[ \int_E f = \sum_k \int_{E_k} f . \]


As another consequence, we have the Leibniz rule for not necessarily bounded
subsets $Y$ of $\mathbb{R}^N$, which is stated as follows.


Theorem 11.49 (Leibniz Rule—II) Let f : X x Y — R be a function, where X 
is a nontrivial interval of $\mathbb{R}$ containing $x_0$, and $Y$ is a subset of $\mathbb{R}^N$, such that:

(a) For every $x \in X$, the function $f(x, \cdot)$ is L-integrable on $Y$, so that we can
define the function
\[ F(x) = \int_Y f(x, y)\,dy . \]
(b) For every $x \in X$ and almost every $y \in Y$ the partial derivative $\frac{\partial f}{\partial x}(x, y)$ exists.
(c) There are two L-integrable functions $g, h : Y \to \mathbb{R}$ such that
\[ g(y) \le \frac{\partial f}{\partial x}(x, y) \le h(y) \]
for every $x \in X$ and almost every $y \in Y$.

Then the function $\frac{\partial f}{\partial x}(x_0, \cdot)$, defined almost everywhere on $Y$, is L-integrable there,
the derivative of $F$ in $x_0$ exists, and
\[ F'(x_0) = \int_Y \left( \frac{\partial f}{\partial x}(x_0, y) \right) dy . \]




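Also the reduction theorem of Guido Fubini extends to functions defined on a not
necessarily bounded subset $E$ of $\mathbb{R}^N$. Let $N = N_1 + N_2$ and write $\mathbb{R}^N = \mathbb{R}^{N_1} \times \mathbb{R}^{N_2}$.
For every $(x, y) \in \mathbb{R}^{N_1} \times \mathbb{R}^{N_2}$, consider the "sections" of $E$:
\[ E_x = \{y \in \mathbb{R}^{N_2} : (x, y) \in E\} , \qquad E_y = \{x \in \mathbb{R}^{N_1} : (x, y) \in E\} , \]
and the "projections" of $E$:
\[ P_1 E = \{x \in \mathbb{R}^{N_1} : E_x \neq \emptyset\} , \qquad P_2 E = \{y \in \mathbb{R}^{N_2} : E_y \neq \emptyset\} . \]
We can then reformulate the theorem in the following form.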


Theorem 11.50 (Reduction Theorem—IV) Let f : E — R be an L-integrable 
function. Then: 


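(a) For almost every $x \in P_1 E$, the function $f(x, \cdot)$ is L-integrable on the set $E_x$.
(b) The function $x \mapsto \int_{E_x} f(x, y)\,dy$, defined almost everywhere on $P_1 E$, is
L-integrable there.
(c) We have
\[ \int_E f = \int_{P_1 E} \left( \int_{E_x} f(x, y)\,dy \right) dx . \]

Analogously, the function $y \mapsto \int_{E_y} f(x, y)\,dx$, defined almost everywhere on
$P_2 E$, is L-integrable there, and we have
\[ \int_E f = \int_{P_2 E} \left( \int_{E_y} f(x, y)\,dx \right) dy . \]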


Proof Consider for simplicity the case $N_1 = N_2 = 1$, the general case being
perfectly analogous. Assume first that $f$ has nonnegative values. By Fubini's
Reduction Theorem 11.36 for bounded sets, once $r > 0$ is fixed, we have that,
for almost every $x \in P_1 E \cap [-r, r]$, the function $f(x, \cdot)$ is L-integrable on
$E_x \cap [-r, r]$; the function $g_r(x) = \int_{E_x \cap [-r, r]} f(x, y)\,dy$, defined almost everywhere
on $P_1 E \cap [-r, r]$, is L-integrable there, and
\[ \int_{E \cap B[0, r]} f = \int_{P_1 E \cap [-r, r]} g_r(x)\,dx . \]
In particular,
\[ \int_{P_1 E \cap [-r, r]} g_r(x)\,dx \le \int_E f , \]
so that if $0 < s \le r$, then we have that $g_r$ is L-integrable on $P_1 E \cap [-s, s]$, and
\[ \int_{P_1 E \cap [-s, s]} g_r \le \int_E f . \]
Keeping $s$ fixed, we let $r$ tend to $+\infty$. Since $f$ has nonnegative values, $g_r(x)$ will
be increasing with respect to $r$. Consequently, for almost every $x \in P_1 E \cap [-s, s]$,
the limit $\lim_{r \to +\infty} g_r(x)$ exists (possibly infinite), and we set
\[ g(x) = \lim_{r \to +\infty} g_r(x) = \lim_{r \to +\infty} \int_{E_x \cap [-r, r]} f(x, y)\,dy . \]
Let $T = \{x \in P_1 E \cap [-s, s] : g(x) = +\infty\}$; let us prove that $T$ is negligible.
We define the sets
\[ E_n^r = \{x \in P_1 E \cap [-s, s] : g_r(x) > n\} . \]
By Lemma 11.16, these sets are measurable and
\[ \mu(E_n^r) \le \frac{1}{n} \int_{P_1 E \cap [-s, s]} g_r(x)\,dx \le \frac{1}{n} \int_E f . \]
Hence, since the sets $E_n^r$ increase with $r$, the sets $F_n = \bigcup_r E_n^r$ are also measurable,
and we have that $\mu(F_n) \le \frac{1}{n} \int_E f$. Since $T \subseteq \bigcap_n F_n$, we deduce that $T$ is
measurable, with $\mu(T) = 0$.

Hence, for almost every $x \in P_1 E \cap [-s, s]$, the function $f(x, \cdot)$ is L-integrable
on the set $E_x$, and, by definition,
\[ \int_{E_x} f(x, y)\,dy = g(x) . \]
Moreover, if we take $r$ in the set of natural numbers and apply the Monotone
Convergence Theorem to the functions $g_r$, it follows that $g$ is L-integrable on
$P_1 E \cap [-s, s]$, and
\[ \int_{P_1 E \cap [-s, s]} g = \lim_{r \to \infty} \int_{P_1 E \cap [-s, s]} g_r , \]
so that
\[ \int_{P_1 E \cap [-s, s]} \left( \int_{E_x} f(x, y)\,dy \right) dx \le \int_E f . \]
Letting now $s$ tend to $+\infty$, we see that the limit
\[ \lim_{s \to +\infty} \int_{P_1 E \cap [-s, s]} \left( \int_{E_x} f(x, y)\,dy \right) dx \]
exists and is finite; therefore, the function $x \mapsto \int_{E_x} f(x, y)\,dy$, defined almost
everywhere on $P_1 E$, is L-integrable there, and its integral is the preceding limit.
Moreover, from the inequality proved earlier, passing to the limit, we have that
\[ \int_{P_1 E} \left( \int_{E_x} f(x, y)\,dy \right) dx \le \int_E f . \]
On the other hand,
\[
\begin{aligned}
\int_{E \cap B[0, r]} f &= \int_{P_1 E \cap [-r, r]} \left( \int_{E_x \cap [-r, r]} f(x, y)\,dy \right) dx \\
&\le \int_{P_1 E \cap [-r, r]} \left( \int_{E_x} f(x, y)\,dy \right) dx \\
&\le \int_{P_1 E} \left( \int_{E_x} f(x, y)\,dy \right) dx ,
\end{aligned}
\]
so that, passing to the limit as $r \to +\infty$,
\[ \int_E f \le \int_{P_1 E} \left( \int_{E_x} f(x, y)\,dy \right) dx . \]
In conclusion, equality must hold, and the proof is thus completed in the case where
$f$ has nonnegative values. In the general case, just consider $f^+$ and $f^-$, and subtract
the corresponding formulas. □


The analogous corollary for the computation of the measure holds. 


Corollary 11.51 Let $E$ be a measurable set. Then $E$ has a finite measure if and
only if:

(a) For almost every $x \in P_1 E$ the set $E_x$ is measurable and has a finite measure.
(b) The function $x \mapsto \mu(E_x)$, defined almost everywhere on $P_1 E$, is L-integrable
there.
(c) We have
\[ \mu(E) = \int_{P_1 E} \mu(E_x)\,dx . \]

With a symmetric statement, if $E$ has a finite measure, we also have
\[ \mu(E) = \int_{P_2 E} \mu(E_y)\,dy . \]


The change of variables formula also extends to unbounded sets, with the same 
statement. 


Theorem 11.52 (Change of Variables Theorem—III) Let A and B be open 
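subsets of $\mathbb{R}^N$ and $\varphi : A \to B$ a diffeomorphism. Let $D \subseteq A$ be a measurable
set and $f : \varphi(D) \to \mathbb{R}$ a function. Then $f$ is L-integrable on $\varphi(D)$ if and only if
$(f \circ \varphi)\,|\det \varphi'|$ is L-integrable on $D$, in which case
\[ \int_{\varphi(D)} f(x)\,dx = \int_D f(\varphi(u))\,|\det \varphi'(u)|\,du . \]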


Proof Assume first that $f$ is L-integrable on $E = \varphi(D)$ with nonnegative values.
Then, for every $r > 0$,
\[ \int_{D \cap B[0, r]} f(\varphi(u))\,|\det \varphi'(u)|\,du = \int_{\varphi(D \cap B[0, r])} f(x)\,dx \le \int_{\varphi(D)} f(x)\,dx , \]
so that the limit
\[ \lim_{r \to +\infty} \int_{D \cap B[0, r]} f(\varphi(u))\,|\det \varphi'(u)|\,du \]
exists and is finite. Then $(f \circ \varphi)\,|\det \varphi'|$ is L-integrable on $D$, and we have
\[ \int_D f(\varphi(u))\,|\det \varphi'(u)|\,du \le \int_{\varphi(D)} f(x)\,dx . \]
On the other hand, for every $r > 0$,
\[ \int_{E \cap B[0, r]} f = \int_{\varphi^{-1}(E \cap B[0, r])} (f \circ \varphi)\,|\det \varphi'| \le \int_{\varphi^{-1}(E)} (f \circ \varphi)\,|\det \varphi'| , \]
so that, passing to the limit,
\[ \int_E f(x)\,dx = \lim_{r \to +\infty} \int_{E \cap B[0, r]} f(x)\,dx \le \int_{\varphi^{-1}(E)} f(\varphi(u))\,|\det \varphi'(u)|\,du . \]
The formula is thus proved when $f$ has nonnegative values. In general, just proceed
as usual, considering $f^+$ and $f^-$.

To obtain the opposite implication, it is sufficient to consider $(f \circ \varphi)\,|\det \varphi'|$
instead of $f$ and $\varphi^{-1}$ instead of $\varphi$ and to repeat the preceding argument. □


Concerning the change of variables in polar coordinates in $\mathbb{R}^2$ or in cylindrical or
spherical coordinates in $\mathbb{R}^3$, the same type of considerations we made for bounded
sets extend to the general case as well.


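Example Let $E = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \ge 1\}$ and $f(x, y) = (x^2 + y^2)^{-\alpha}$, with
$\alpha > 0$. We have
\[ \int_E \frac{dx\,dy}{(x^2 + y^2)^{\alpha}} = \int_0^{2\pi} \left( \int_1^{+\infty} \frac{\rho}{\rho^{2\alpha}}\,d\rho \right) d\theta = 2\pi \int_1^{+\infty} \rho^{1 - 2\alpha}\,d\rho . \]
It is thus seen that $f$ is integrable on $E$ if and only if $\alpha > 1$, in which case the
integral is $\frac{\pi}{\alpha - 1}$.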
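The dichotomy between $\alpha > 1$ and $\alpha \le 1$ can be made concrete by looking at the integral over the annulus $1 \le x^2 + y^2 \le R^2$ as $R$ grows. The function `truncated_integral` below is a hypothetical helper written for this illustration; it uses the elementary antiderivative of $\rho^{1 - 2\alpha}$ rather than any numerical quadrature.

```python
import math


def truncated_integral(alpha, R):
    """Integral of (x^2 + y^2)^(-alpha) over the annulus 1 <= x^2 + y^2 <= R^2.

    In polar coordinates this is 2*pi * integral_1^R rho^(1 - 2*alpha) d rho,
    evaluated here from the antiderivative.
    """
    if alpha == 1.0:
        return 2.0 * math.pi * math.log(R)
    p = 2.0 - 2.0 * alpha
    return 2.0 * math.pi * (R**p - 1.0) / p


# alpha = 2 > 1: the truncated integrals approach pi / (alpha - 1) = pi.
for R in (10.0, 1e3, 1e6):
    print(R, truncated_integral(2.0, R))

# alpha = 1 and alpha = 1/2: the truncated integrals grow without bound.
for R in (10.0, 1e3, 1e6):
    print(R, truncated_integral(1.0, R), truncated_integral(0.5, R))
```

The truncation by Euclidean balls instead of the cubes $B[0, r]$ is harmless here, in line with the remark made earlier in this section.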


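Example Let us compute the three-dimensional measure of the set
\[ E = \left\{ (x, y, z) \in \mathbb{R}^3 : x \ge 1,\ \sqrt{y^2 + z^2} \le \frac{1}{x} \right\} . \]
Using Fubini's Reduction Theorem 11.50, grouping together the variables $(y, z)$ we
have
\[ \mu(E) = \int_1^{+\infty} \pi\,\frac{1}{x^2}\,dx = \pi . \]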


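Example Consider the function $f(x, y) = e^{-(x^2 + y^2)}$, and let us make a change of
variables in polar coordinates:
\[ \int_{\mathbb{R}^2} e^{-(x^2 + y^2)}\,dx\,dy = \int_0^{2\pi} \left( \int_0^{+\infty} e^{-\rho^2} \rho\,d\rho \right) d\theta = 2\pi \left[ -\frac{1}{2} e^{-\rho^2} \right]_0^{+\infty} = \pi . \]
Notice that, using Fubini's Reduction Theorem 11.50, we have
\[ \int_{\mathbb{R}^2} e^{-(x^2 + y^2)}\,dx\,dy = \int_{-\infty}^{+\infty} \left( \int_{-\infty}^{+\infty} e^{-x^2} e^{-y^2}\,dx \right) dy = \left( \int_{-\infty}^{+\infty} e^{-x^2}\,dx \right)^2 , \]
and we thus find again the formula
\[ \int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi} . \]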
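A quick numerical experiment illustrates this identity. The sketch below is not from the text: it truncates both integrals at a radius $R = 8$ (an arbitrary choice, beyond which the integrands are negligible) and uses an illustrative midpoint-rule helper.

```python
import math


def midpoint(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


R = 8.0  # truncation radius; the tails beyond it are smaller than exp(-64)

# One-dimensional integral of exp(-x^2) over [-R, R].
one_dim = midpoint(lambda x: math.exp(-x * x), -R, R)

# Polar side, truncated at radius R: 2*pi * integral_0^R exp(-rho^2) rho d rho.
polar = 2.0 * math.pi * midpoint(lambda rho: math.exp(-rho * rho) * rho, 0.0, R)

print(one_dim, math.sqrt(math.pi))
print(one_dim**2, polar, math.pi)
```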


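11.15 The Integral on M-Surfaces

We now want to define the integral of a function $f : U \to \mathbb{R}$ on an M-surface
$\sigma : I \to \mathbb{R}^N$ whose image is contained in $U$.

When the indices $i_1, \dots, i_M$ vary in the set $\{1, \dots, N\}$, then for every $u \in I$ we
can consider the $M \times M$ matrices obtained from the Jacobian matrix $J\sigma(u)$ (also
denoted by $\sigma'(u)$) by selecting the corresponding rows, i.e.,
\[
\sigma'_{(i_1, \dots, i_M)}(u) =
\begin{pmatrix}
\frac{\partial \sigma_{i_1}}{\partial u_1}(u) & \cdots & \frac{\partial \sigma_{i_1}}{\partial u_M}(u) \\
\vdots & & \vdots \\
\frac{\partial \sigma_{i_M}}{\partial u_1}(u) & \cdots & \frac{\partial \sigma_{i_M}}{\partial u_M}(u)
\end{pmatrix} ,
\]
and we can define the vector $\Sigma(u)$ in $\mathbb{R}^{\binom{N}{M}}$ as
\[ \Sigma(u) = \big( \det \sigma'_{(i_1, \dots, i_M)}(u) \big)_{1 \le i_1 < \cdots < i_M \le N} . \]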


Definition 11.53 The function $f : U \to \mathbb{R}$ is "integrable" on the M-surface $\sigma :
I \to \mathbb{R}^N$ if $(f\circ\sigma)\,\|\Sigma\|$ is integrable on $I$. In that case, we set

\[ \int_\sigma f = \int_I f(\sigma(u))\,\|\Sigma(u)\|\,du. \]


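In the case $M = 1$, we have a curve $\sigma : [a, b] \to \mathbb{R}^N$ and, given a scalar function
$f$ defined on the image of $\sigma$,

\[ \int_\sigma f = \int_a^b f(\sigma(t))\,\|\sigma'(t)\|\,dt. \]

If $M = 2$ and $N = 3$, we have the surface $\sigma : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3$ and,
given a scalar function $f$, defined on the image of $\sigma$, it can be checked that

\[ \int_\sigma f = \int_{a_2}^{b_2}\int_{a_1}^{b_1} f(\sigma(u,v))\left\|\frac{\partial\sigma}{\partial u}(u,v)\times\frac{\partial\sigma}{\partial v}(u,v)\right\|\,du\,dv. \]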
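For $M = 2$ and $N = 3$, the norm of $\Sigma(u)$ and the norm of the cross product of the two partial derivatives can be compared numerically. The sketch below is illustrative only: the helper names (`det`, `sigma_vector`, `sphere_jacobian`, `cross`, `norm`) and the choice of a sphere of radius 2 as test surface are assumptions made for the example.

```python
import math
from itertools import combinations

def det(matrix):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def sigma_vector(jacobian):
    """Determinants of all M x M submatrices of an N x M Jacobian, rows in increasing order."""
    n_rows, n_cols = len(jacobian), len(jacobian[0])
    return [det([jacobian[i] for i in rows])
            for rows in combinations(range(n_rows), n_cols)]

def norm(vector):
    return math.sqrt(sum(x * x for x in vector))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def sphere_jacobian(radius, phi, theta):
    """Jacobian of (phi, theta) -> radius * (sin phi cos theta, sin phi sin theta, cos phi)."""
    return [
        [radius * math.cos(phi) * math.cos(theta), -radius * math.sin(phi) * math.sin(theta)],
        [radius * math.cos(phi) * math.sin(theta), radius * math.sin(phi) * math.cos(theta)],
        [-radius * math.sin(phi), 0.0],
    ]

jac = sphere_jacobian(2.0, 0.7, 1.9)
d_phi = [row[0] for row in jac]      # first column: derivative with respect to phi
d_theta = [row[1] for row in jac]    # second column: derivative with respect to theta
print(norm(sigma_vector(jac)), norm(cross(d_phi, d_theta)))
```

The two printed numbers should agree up to rounding, and both should equal $R^2\sin\phi$ for the chosen point.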




It is important to see what happens to the integral when we have two equivalent 
M-surfaces. 


Definition 11.54 Two M-surfaces $\sigma : I \to \mathbb{R}^N$ and $\tilde\sigma : J \to \mathbb{R}^N$ are said to be
"equivalent" if they have the same image and there are two open sets $A \subseteq I$, $B \subseteq J$,
and a diffeomorphism $\varphi : A \to B$ with the following properties. The sets $I \setminus A$ and
$J \setminus B$ are negligible and $\sigma(u) = \tilde\sigma(\varphi(u))$ for every $u \in A$.


Let us prove that the integral does not differ for equivalent M-surfaces. 


Theorem 11.55 If $\sigma$ and $\tilde\sigma$ are two equivalent M-surfaces, then

\[ \int_\sigma f = \int_{\tilde\sigma} f. \]


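Proof With the notations introduced previously, since $\sigma = \tilde\sigma\circ\varphi$, with $\varphi : A \to B$,
we have

\[ \begin{aligned} \Sigma(u) &= \left(\det\sigma'_{(i_1,\dots,i_M)}(u)\right)_{1\le i_1<\dots<i_M\le N} \\ &= \left(\det\left(\tilde\sigma'_{(i_1,\dots,i_M)}(\varphi(u))\,\varphi'(u)\right)\right)_{1\le i_1<\dots<i_M\le N} \\ &= \left(\det\tilde\sigma'_{(i_1,\dots,i_M)}(\varphi(u))\right)_{1\le i_1<\dots<i_M\le N}\det\varphi'(u) \\ &= \tilde\Sigma(\varphi(u))\,\det\varphi'(u). \end{aligned} \]

Therefore, by the Change of Variables Theorem 11.43, since $I \setminus A$ and $J \setminus B$ are
negligible, we have that

\[ \begin{aligned} \int_\sigma f &= \int_A f(\sigma(u))\,\|\Sigma(u)\|\,du \\ &= \int_A f(\tilde\sigma(\varphi(u)))\,\|\tilde\Sigma(\varphi(u))\|\,|\det\varphi'(u)|\,du \\ &= \int_B f(\tilde\sigma(v))\,\|\tilde\Sigma(v)\|\,dv = \int_{\tilde\sigma} f, \end{aligned} \]

thereby proving the claim. ∎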


The following theorem is crucial for the treatment of the measure of M-parametrizable
M-surfaces. We recall that $\sigma : I \to \mathbb{R}^N$ is an "M-parametrization" of a set
$\mathcal{M}$ if it is regular, injective on $\mathring{I}$, and $\sigma(I) = \mathcal{M}$.


Theorem 11.56 Two M-parametrizations of the same set are always equivalent. 




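Proof Let $\mathcal{M}$ be the subset of $\mathbb{R}^N$ taken into consideration, and let $\sigma : I \to \mathbb{R}^N$
and $\tilde\sigma : J \to \mathbb{R}^N$ be two of its M-parametrizations. We define the sets

\[ A = \mathring{I}\cap\sigma^{-1}\big(\mathcal{M}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))\big), \qquad B = \mathring{J}\cap\tilde\sigma^{-1}\big(\mathcal{M}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))\big). \]

Then, for every $u \in A$, since $\sigma(u) \in \mathcal{M}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))$ and $\tilde\sigma(J) = \mathcal{M}$, there
exists a $v \in J$ such that $\tilde\sigma(v) = \sigma(u)$. Clearly, $\tilde\sigma(v) \in \mathcal{M}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))$, so
that $v \in B$. Moreover, since $\tilde\sigma$ is injective on $\mathring{J}$, there is a unique $v$ in $J$ with such
a property. We can thus define $\varphi : A \to B$ by setting $\varphi(u) = v$. Hence, for $u \in A$
and $v \in B$,

\[ \varphi(u) = v \iff \sigma(u) = \tilde\sigma(v). \]

This function $\varphi : A \to B$ is invertible; a symmetrical argument may be used to
define its inverse $\varphi^{-1} : B \to A$.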

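Let us verify that the set $A$ is open. Since $\sigma$, $\tilde\sigma$ are continuous functions and $\partial I$,
$\partial J$ are compact sets, we have that $\sigma(\partial I)\cup\tilde\sigma(\partial J)$ is compact, hence closed. Then
$\mathcal{M}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))$ is relatively open in $\mathcal{M}$, and $\sigma^{-1}\big(\mathcal{M}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))\big)$ is
relatively open in $I$, so that its intersection with $\mathring{I}$ is an open set. In an analogous
way it can be seen that $B$ is an open set, as well.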

Let us take a $v_0 \in \mathring{J}$ and set $x_0 = \tilde\sigma(v_0)$. The Jacobian matrix $\tilde\sigma'(v_0)$ has rank
$M$, and we may assume without loss of generality that the first $M$ rows are linearly
independent. Since $\mathbb{R}^N \simeq \mathbb{R}^M \times \mathbb{R}^{N-M}$, we will write every point $x \in \mathbb{R}^N$ in the
form $x = (x_1, x_2)$, with $x_1 \in \mathbb{R}^M$ and $x_2 \in \mathbb{R}^{N-M}$. However, so as not to have
double indices in what follows, we will write $x_0 = (x_1^0, x_2^0)$.

Let $\Phi : J \times \mathbb{R}^{N-M} \to \mathbb{R}^N$ be defined as

\[ \Phi(v, z) = \tilde\sigma(v) + (0, z). \]

Then $\Phi'(v_0, 0)$ is invertible, so that $\Phi$ is a local diffeomorphism: There are an
open neighborhood $V_0$ of $v_0$, an open neighborhood $\Omega_0$ of $0$ in $\mathbb{R}^{N-M}$ and an
open neighborhood $W_0$ of $x_0$ such that $\Phi : V_0 \times \Omega_0 \to W_0$ is a diffeomorphism.
Moreover, we can assume that $V_0 \subseteq \mathring{J}$. Let $\Psi = \Phi^{-1} : W_0 \to V_0 \times \Omega_0$. We will
write $\Psi(x) = (\Psi_1(x), \Psi_2(x))$, with $\Psi_1(x) \in V_0$ and $\Psi_2(x) \in \Omega_0$.

We now prove that $\varphi$ is of class $C^1$. Take $u_0 \in A$, and set $x_0 = \sigma(u_0)$ and
$v_0 = \varphi(u_0)$. Assume $v_0$ as previously, with $\tilde\sigma'(v_0)$ having the first $M$ rows linearly
independent, so that the local diffeomorphism $\Psi : W_0 \to V_0 \times \Omega_0$ can be defined.
Take an open neighborhood $U_0$ of $u_0$, contained in $A$, such that $\sigma(U_0) \subseteq W_0$. Then,
for $u \in U_0$ and $v \in B$,

\[ \varphi(u) = v \iff \sigma(u) = \Phi(v, 0) \iff (v, 0) = \Psi(\sigma(u)). \]

Hence, $\varphi$ coincides with $\Psi_1\circ\sigma$ on the open set $U_0$, yielding that $\varphi$ is continuously
differentiable.


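In a symmetric way, it is proved that $\varphi^{-1} : B \to A$ is of class $C^1$, so that $\varphi$
happens to be a diffeomorphism.

We now prove that the sets $I \setminus A$ and $J \setminus B$ are negligible. Let us consider, e.g.,
the second one:

\[ J\setminus B = \partial J\cup(\mathring{J}\setminus B) = \partial J\cup\{v\in\mathring{J} : \tilde\sigma(v)\in\sigma(\partial I)\}\cup\{v\in\mathring{J} : \tilde\sigma(v)\in\tilde\sigma(\partial J)\}. \]

We know that $\partial J$ is negligible. Let us prove that $\{v \in \mathring{J} : \tilde\sigma(v) \in \sigma(\partial I)\}$ is also
negligible.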

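Let $v_0 \in \mathring{J}$ be such that $\tilde\sigma(v_0) \in \sigma(\partial I)$. Then there is a $u_0 \in \partial I$ such that
$\sigma(u_0) = \tilde\sigma(v_0)$. We argue as previously and define $\Psi : W_0 \to V_0 \times \Omega_0$. Let $U_0$ be
an open neighborhood of $u_0$ such that $\sigma(U_0 \cap I) \subseteq W_0$. Let us see that

\[ \mathring{J}\cap\tilde\sigma^{-1}(\sigma(U_0\cap\partial I)) \subseteq (\Psi_1\circ\sigma)(U_0\cap\partial I). \]

Indeed, taking $v \in \mathring{J}\cap\tilde\sigma^{-1}(\sigma(U_0\cap\partial I))$, we have that $\tilde\sigma(v) \in \sigma(U_0\cap\partial I)$.
Then, since $\Phi(v, 0) = \tilde\sigma(v)$, we have that $\Psi(\tilde\sigma(v)) = (v, 0) \in V_0 \times \Omega_0$, hence
$v \in \Psi_1(\sigma(U_0\cap\partial I))$, and the inclusion is thus proved. Now, since $\Psi_1\circ\sigma$ is of
class $C^1$, by Lemma 11.38 we have that $(\Psi_1\circ\sigma)(U_0\cap\partial I)$ is negligible. Finally, the
conclusion that $\{v \in \mathring{J} : \tilde\sigma(v) \in \sigma(\partial I)\}$ is negligible follows from the fact that $\partial I$
is compact, so that it can be covered by a finite number of such open sets as $U_0$.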

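It remains to be proved that $\{v \in \mathring{J} : \tilde\sigma(v) \in \tilde\sigma(\partial J)\}$ is negligible. Let $v_0 \in \mathring{J}$
be such that $\tilde\sigma(v_0) \in \tilde\sigma(\partial J)$. Then there is a $\tilde v_0 \in \partial J$ such that $\tilde\sigma(v_0) = \tilde\sigma(\tilde v_0)$. Let
$\tilde V_0$ be an open neighborhood of $\tilde v_0$ such that $\tilde\sigma(\tilde V_0\cap J) \subseteq W_0$. As above, one sees
that $\mathring{J}\cap\tilde\sigma^{-1}(\tilde\sigma(\tilde V_0\cap\partial J)) \subseteq (\Psi_1\circ\tilde\sigma)(\tilde V_0\cap\partial J)$, showing that $\mathring{J}\cap\tilde\sigma^{-1}(\tilde\sigma(\tilde V_0\cap\partial J))$
is negligible. The conclusion is obtained as above, covering $\partial J$ by a finite number
of such open sets $\tilde V_0$. ∎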


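If $\mathcal{M}$ is an M-parametrizable set, we can define the integral of $f$ on $\mathcal{M}$ as $\int_\sigma f$,
where $\sigma$ is any M-parametrization of $\mathcal{M}$. We will denote it by

\[ \int_{\mathcal{M}} f\,d\mu_M \quad\text{or}\quad \int_{\mathcal{M}} f(x)\,d\mu_M(x). \]

If $M = N$, one reobtains the usual integral, i.e., $\int_{\mathcal{M}} f(x)\,dx$.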


11.16 M-Dimensional Measure 


Consider the interesting case where $f$ is constantly equal to 1.
If $M = 1$, we have a curve $\sigma : [a, b] \to \mathbb{R}^N$. The integral

\[ \iota_1(\sigma) = \int_a^b \|\sigma'(t)\|\,dt \]

is said to be the "length" (or curvilinear measure) of the curve $\sigma$.




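Example Let $\sigma : [0, b] \to \mathbb{R}^3$ be defined by $\sigma(t) = (t, t^2, 0)$. Its image is an arc of
parabola, and its length is given by

\[ \begin{aligned} \iota_1(\sigma) &= \int_0^b \sqrt{1 + (2t)^2}\,dt \\ &= \int_{\sinh^{-1}(0)}^{\sinh^{-1}(2b)} \frac{1}{2}(\cosh u)^2\,du \\ &= \frac{1}{2}\left[\frac{u + \sinh u\cosh u}{2}\right]_{0}^{\sinh^{-1}(2b)} \\ &= \frac{1}{4}\left(\sinh^{-1}(2b) + 2b\sqrt{1 + 4b^2}\right) \\ &= \frac{1}{4}\ln\left(2b + \sqrt{1 + 4b^2}\right) + \frac{b}{2}\sqrt{1 + 4b^2}. \end{aligned} \]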
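As a sanity check of the closed form for the parabola, one can compare it with a midpoint-rule approximation of $\int_0^b \sqrt{1+4t^2}\,dt$. This is a hedged sketch: the function names and the choice $b = 1.5$ are hypothetical and only serve the illustration.

```python
import math

def parabola_length_closed_form(b):
    """Closed form (1/4) asinh(2b) + (b/2) sqrt(1 + 4 b^2) from the example."""
    return 0.25 * math.asinh(2 * b) + 0.5 * b * math.sqrt(1 + 4 * b * b)

def parabola_length_numeric(b, steps=100000):
    """Midpoint-rule approximation of the integral of sqrt(1 + (2t)^2) over [0, b]."""
    h = b / steps
    return h * sum(math.sqrt(1 + (2 * (i + 0.5) * h) ** 2) for i in range(steps))

b = 1.5
print(parabola_length_closed_form(b), parabola_length_numeric(b))
```

The two logarithmic forms in the example agree because $\sinh^{-1}(y) = \ln\big(y + \sqrt{1+y^2}\big)$.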


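If $M = 2$ and $N = 3$, then we have a surface $\sigma : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3$. The
integral

\[ \iota_2(\sigma) = \int_{a_2}^{b_2}\int_{a_1}^{b_1}\left\|\frac{\partial\sigma}{\partial u}(u,v)\times\frac{\partial\sigma}{\partial v}(u,v)\right\|\,du\,dv \]

is said to be the "area" (or surface measure) of the surface $\sigma$.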


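Example Let $\sigma : [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$ be defined by

\[ \sigma(\phi, \theta) = (R\sin\phi\cos\theta,\ R\sin\phi\sin\theta,\ R\cos\phi). \]

Its image is a sphere of radius $R$, and its area is given by

\[ \begin{aligned} \iota_2(\sigma) &= \int_0^{2\pi}\int_0^{\pi}\sqrt{(R^2\sin^2\phi\cos\theta)^2 + (R^2\sin^2\phi\sin\theta)^2 + (R^2\sin\phi\cos\phi)^2}\,d\phi\,d\theta \\ &= \int_0^{2\pi}\int_0^{\pi} R^2\sin\phi\,d\phi\,d\theta \\ &= 4\pi R^2. \end{aligned} \]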


In general, when the function $f$ is constantly equal to 1, we have the following
definition.

Definition 11.57 We call "M-superficial measure" of an M-surface $\sigma : I \to \mathbb{R}^N$
the following integral:

\[ \iota_M(\sigma) = \int_I \|\Sigma(u)\|\,du = \int_I \Bigg[\sum_{1\le i_1<\dots<i_M\le N}\left(\det\sigma'_{(i_1,\dots,i_M)}(u)\right)^2\Bigg]^{1/2} du. \]


As reasonably expected, a direct consequence of Theorems 11.55 and 11.56 is 
the following corollary. 


Corollary 11.58 Two equivalent M-surfaces always have the same M-superficial 
measure. In particular, this is true of any two M-parametrizations of a given set. 


Example Consider the two curves $\sigma, \tilde\sigma : [0, 2\pi] \to \mathbb{R}^2$, defined by

\[ \sigma(t) = (\cos(t), \sin(t)), \qquad \tilde\sigma(t) = (\cos(2t), \sin(2t)). \]

Notice that, even if they have the same image, these curves are not equivalent.
Indeed, as easily seen, $\iota_1(\sigma) = 2\pi$ while $\iota_1(\tilde\sigma) = 4\pi$.


The foregoing considerations naturally lead to the following definition. 


Definition 11.59 We call "M-dimensional measure" of an M-parametrizable set
$\mathcal{M} \subseteq \mathbb{R}^N$ the M-superficial measure of any of its M-parametrizations.


We denote the M-dimensional measure of $\mathcal{M}$ by $\mu_M(\mathcal{M})$. In cases where $M =
1, 2$, the M-dimensional measure of $\mathcal{M}$ is often called the "length" or "area" of $\mathcal{M}$,
respectively. We may thus consider, for example, the length of a circle or the area
of a sphere. If $M = N$, it can be verified that the N-dimensional measure of the set
$\mathcal{M}$ is the same as the usual measure, i.e., $\mu_N(\mathcal{M}) = \mu(\mathcal{M})$.


11.17 Length and Area


Let us first consider a curve $\sigma : [a, b] \to \mathbb{R}^N$. For any partition $\mathcal{P}$ of the interval
$[a, b]$ of the type

\[ a = a_0 < a_1 < \dots < a_{n-1} < a_n = b, \]

we compute the length of the polygonal curve joining the points $\sigma(a_j)$, i.e.,

\[ \ell(\sigma, \mathcal{P}) = \sum_{j=1}^{n}\|\sigma(a_j) - \sigma(a_{j-1})\|. \]




It is rather intuitive that these lengths may be chosen as a good approximation of the 
length of the curve o, provided that the points of the partition are sufficiently close 
to one another. What follows is the precise statement. 

Theorem 11.60 The length of the curve $\sigma$ is obtained as

\[ \iota_1(\sigma) = \sup\{\ell(\sigma, \mathcal{P}) : \mathcal{P} \text{ partition of } [a, b]\}. \]


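Proof For every partition $\mathcal{P}$ of $[a, b]$ one has that

\[ \|\sigma(a_j) - \sigma(a_{j-1})\| = \left\|\int_{a_{j-1}}^{a_j}\sigma'(t)\,dt\right\| \le \int_{a_{j-1}}^{a_j}\|\sigma'(t)\|\,dt, \]

hence

\[ \ell(\sigma, \mathcal{P}) = \sum_{j=1}^{n}\|\sigma(a_j) - \sigma(a_{j-1})\| \le \sum_{j=1}^{n}\int_{a_{j-1}}^{a_j}\|\sigma'(t)\|\,dt = \int_a^b\|\sigma'(t)\|\,dt. \]

Then also

\[ \sup\{\ell(\sigma, \mathcal{P}) : \mathcal{P} \text{ partition of } [a, b]\} \le \int_a^b\|\sigma'(t)\|\,dt. \]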


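Let us prove now the opposite inequality. Fix $\varepsilon > 0$. Since $\sigma' : [a, b] \to \mathbb{R}^N$ is
continuous, by the Heine Theorem 4.12, it is uniformly continuous. Hence, there
exists a $\delta > 0$ such that

\[ |s - t| \le \delta \implies \|\sigma'(s) - \sigma'(t)\| \le \varepsilon. \]

Let $\mathcal{P}$ be a partition of $[a, b]$ such that $a_j - a_{j-1} \le \delta$ for every $j = 1, \dots, n$.
If $t \in [a_{j-1}, a_j]$, then $\|\sigma'(t) - \sigma'(a_j)\| \le \varepsilon$, hence $\|\sigma'(t)\| \le \|\sigma'(a_j)\| + \varepsilon$,
implying that

\[ \begin{aligned} \int_{a_{j-1}}^{a_j}\|\sigma'(t)\|\,dt &\le \int_{a_{j-1}}^{a_j}\big(\|\sigma'(a_j)\| + \varepsilon\big)\,dt \\ &= \int_{a_{j-1}}^{a_j}\|\sigma'(a_j)\|\,dt + \varepsilon(a_j - a_{j-1}) \\ &= \left\|\int_{a_{j-1}}^{a_j}\sigma'(a_j)\,dt\right\| + \varepsilon(a_j - a_{j-1}) \\ &\le \left\|\int_{a_{j-1}}^{a_j}\big(\sigma'(a_j) - \sigma'(t)\big)\,dt\right\| + \left\|\int_{a_{j-1}}^{a_j}\sigma'(t)\,dt\right\| + \varepsilon(a_j - a_{j-1}) \\ &\le \int_{a_{j-1}}^{a_j}\|\sigma'(a_j) - \sigma'(t)\|\,dt + \|\sigma(a_j) - \sigma(a_{j-1})\| + \varepsilon(a_j - a_{j-1}) \\ &\le \varepsilon(a_j - a_{j-1}) + \|\sigma(a_j) - \sigma(a_{j-1})\| + \varepsilon(a_j - a_{j-1}). \end{aligned} \]

Therefore,

\[ \begin{aligned} \int_a^b\|\sigma'(t)\|\,dt &= \sum_{j=1}^{n}\int_{a_{j-1}}^{a_j}\|\sigma'(t)\|\,dt \\ &\le \sum_{j=1}^{n}\big(\|\sigma(a_j) - \sigma(a_{j-1})\| + 2\varepsilon(a_j - a_{j-1})\big) \\ &= \ell(\sigma, \mathcal{P}) + 2\varepsilon(b - a) \\ &\le \sup\{\ell(\sigma, \mathcal{P}) : \mathcal{P} \text{ partition of } [a, b]\} + 2\varepsilon(b - a). \end{aligned} \]

In view of the arbitrariness of $\varepsilon$, the inequality

\[ \int_a^b\|\sigma'(t)\|\,dt \le \sup\{\ell(\sigma, \mathcal{P}) : \mathcal{P} \text{ partition of } [a, b]\} \]

must hold true. The statement is thus proved. ∎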
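The statement of Theorem 11.60 can be observed numerically. The sketch below uses uniform partitions of a helix; the curve, the helper name `polygonal_length`, and the partition sizes are assumptions chosen for illustration. Doubling the number of points refines the partition, so by the triangle inequality the polygonal lengths cannot decrease, and they stay below the integral $\int_0^{2\pi}\|\sigma'(t)\|\,dt = 2\pi\sqrt{2}$.

```python
import math

def polygonal_length(sigma, a, b, n):
    """Length of the polygonal curve through sigma(a_0), ..., sigma(a_n) for a uniform partition."""
    points = [sigma(a + (b - a) * j / n) for j in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def helix(t):
    return (math.cos(t), math.sin(t), t)

exact = 2 * math.pi * math.sqrt(2)  # integral of ||sigma'(t)|| = sqrt(2) over [0, 2*pi]
lengths = [polygonal_length(helix, 0.0, 2 * math.pi, 2 ** k) for k in range(2, 13)]
for k, value in zip(range(2, 13), lengths):
    print(2 ** k, value, exact - value)
```

Uniform partitions are only one convenient family; the supremum in the theorem is taken over all partitions.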


One could now try to see whether a similar construction can also be made for 
surfaces. Surprisingly enough, we will see that, in general, this is not possible. 

Let us consider the lateral surface of a cylinder with a circular base having radius
$r$ and height $h$. We parametrize it in cylindrical coordinates through the function
$\sigma : [0, 2\pi] \times [0, h] \to \mathbb{R}^3$ defined as

\[ \sigma(\theta, z) = (r\cos\theta,\ r\sin\theta,\ z). \]

Its area can be easily computed, and it is equal to

\[ \iota_2(\sigma) = 2\pi r h. \]


Let us now construct the "Schwarz lantern." It is a polyhedron having $4mn$
triangular faces inscribed in the considered cylinder. The vertices of this polyhedron
correspond to the points obtained subdividing the domain into $nm$ subrectangles

\[ \left[(j-1)\frac{2\pi}{m},\ j\frac{2\pi}{m}\right]\times\left[(k-1)\frac{h}{n},\ k\frac{h}{n}\right], \quad\text{with } j = 1, \dots, m,\ k = 1, \dots, n, \]

and then further dividing each of them, by means of their diagonals, into four equal
triangles. We will denote by $A(m, n)$ the area of this polyhedron.


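Using simple geometrical considerations it can be seen that each of the $4mn$ faces
of the polyhedron is an isosceles triangle whose base length is equal to

\[ b = 2r\sin\left(\frac{\pi}{m}\right), \]

and whose height is equal to

\[ \sqrt{r^2\left(1 - \cos\left(\frac{\pi}{m}\right)\right)^2 + \left(\frac{h}{2n}\right)^2}. \]

Indeed, as seen in the figure,

\[ AE = r\sin\left(\frac{\pi}{m}\right), \quad OE = r\cos\left(\frac{\pi}{m}\right), \quad ED = r\left(1 - \cos\left(\frac{\pi}{m}\right)\right), \quad CD = \frac{h}{2n}. \]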
m m m 


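Hence, the sum of the areas of the $4mn$ triangles is

\[ \begin{aligned} A(m, n) &= 4mn\cdot\frac{1}{2}\cdot 2r\sin\left(\frac{\pi}{m}\right)\sqrt{r^2\left(1 - \cos\left(\frac{\pi}{m}\right)\right)^2 + \frac{h^2}{4n^2}} \\ &= 2\pi r\,\frac{\sin(\pi/m)}{\pi/m}\sqrt{h^2 + \left(\frac{\pi^2 r n}{m^2}\right)^2\left(\frac{\sin(\pi/(2m))}{\pi/(2m)}\right)^4}. \end{aligned} \]

We see in this formula an unpleasant term: $\left(\dfrac{\pi^2 r n}{m^2}\right)^2$. If $m \to +\infty$ and $n \to +\infty$,
it is not guaranteed that it tends to zero, as we would like, so as to have that $A(m, n)$
approaches $2\pi r h$. On the contrary, we must admit that


the limit of $A(m, n)$ for $(m, n) \to (+\infty, +\infty)$ does not exist!


Note, for example, that $A(m, m) \to 2\pi r h$, while $A(m, m^3) \to +\infty$ and, for every
$\ell > 2\pi r h$, there exists a sequence $(n_m)_m$ such that $n_m \to +\infty$ and $A(m, n_m) \to \ell$.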
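The different behaviors of $A(m, n)$ can be observed numerically. The following sketch computes the area directly from the base $2r\sin(\pi/m)$ and from the height obtained by the Pythagorean theorem from the lengths $ED$ and $CD$ given above; the function name `lantern_area` and the default values $r = h = 1$ are assumptions made for the illustration.

```python
import math

def lantern_area(m, n, r=1.0, h=1.0):
    """Total area of the 4*m*n isosceles triangles of the Schwarz lantern."""
    base = 2 * r * math.sin(math.pi / m)
    # Height of each face: hypotenuse of the right triangle with legs ED and CD.
    ed = r * (1 - math.cos(math.pi / m))
    cd = h / (2 * n)
    height = math.sqrt(ed ** 2 + cd ** 2)
    return 4 * m * n * 0.5 * base * height

cylinder_area = 2 * math.pi  # 2*pi*r*h with r = h = 1
for m in (10, 50, 200):
    print(m, lantern_area(m, m), lantern_area(m, m ** 2), lantern_area(m, m ** 3))
print("cylinder:", cylinder_area)
```

With $n = m$ the values approach $2\pi r h$; with $n = m^2$ they approach the larger value $2\pi r\sqrt{h^2 + \pi^4 r^2}$; with $n = m^3$ they grow without bound.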


11.18 Approximation with Smooth M-Surfaces


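Let $\sigma : I \to \mathbb{R}^N$ be an M-surface. In the final chapter of the book, we will need the
following approximation result.


Proposition 11.61 There exists a sequence of $C^\infty$-functions $\sigma_n : \mathbb{R}^M \to \mathbb{R}^N$ such
that

\[ \lim_n \sigma_n(u) = \sigma(u), \qquad \lim_n \frac{\partial\sigma_n}{\partial u_j}(u) = \frac{\partial\sigma}{\partial u_j}(u), \qquad\text{uniformly on } I, \]

for every $j = 1, \dots, M$.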


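Proof First of all, we can extend $\sigma$ to a $C^1$-function on a rectangle $J$ such that
$I \subseteq \mathring{J}$, and then we set $\sigma(u) = 0$ when $u \notin J$.
Consider the $C^\infty$-function $\varphi : \mathbb{R}^M \to \mathbb{R}$ defined as

\[ \varphi(z) = \begin{cases} C\exp\left(\dfrac{1}{\|z\|^2 - 1}\right) & \text{if } \|z\| < 1, \\ 0 & \text{if } \|z\| \ge 1, \end{cases} \tag{11.1} \]

where the constant $C > 0$ is chosen in such a way that

\[ \int_{B(0,1)}\varphi(z)\,dz = 1. \]

Define $\sigma_n : \mathbb{R}^M \to \mathbb{R}^N$ as

\[ \sigma_n(u) = n^M\int_{\mathbb{R}^M}\sigma(y)\,\varphi(n(u - y))\,dy. \]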


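We see that each $\sigma_n$ is a $C^\infty$-function and, by a change of variables,

\[ \sigma_n(u) = \int_{\mathbb{R}^M}\sigma\left(u - \tfrac{1}{n}z\right)\varphi(z)\,dz = \int_{B(0,1)}\sigma\left(u - \tfrac{1}{n}z\right)\varphi(z)\,dz. \]

Hence,

\[ \sigma_n(u) - \sigma(u) = \int_{B(0,1)}\left(\sigma\left(u - \tfrac{1}{n}z\right) - \sigma(u)\right)\varphi(z)\,dz. \]

Let $\varepsilon > 0$ be fixed. Since $\sigma$ is uniformly continuous on $J$, there exists a $\delta > 0$ such
that, if $w$ and $w'$ belong to $J$ and $\|w' - w\| \le \delta$, then $\|\sigma(w') - \sigma(w)\| \le \varepsilon$. Now, it
is surely true that if $u \in I$, $z \in B(0,1)$, and $n$ is large enough, then $u - \frac{1}{n}z \in J$ and $\|\frac{1}{n}z\| \le \delta$,
whence

\[ \|\sigma_n(u) - \sigma(u)\| \le \int_{B(0,1)}\left\|\sigma\left(u - \tfrac{1}{n}z\right) - \sigma(u)\right\|\varphi(z)\,dz \le \varepsilon\int_{B(0,1)}\varphi(z)\,dz = \varepsilon. \]

This proves that $\lim_n \sigma_n = \sigma$ uniformly on $I$.
Now observe that, by the Leibniz Rule, for every $j = 1, \dots, M$,

\[ \frac{\partial\sigma_n}{\partial u_j}(u) = \int_{B(0,1)}\frac{\partial\sigma}{\partial u_j}\left(u - \tfrac{1}{n}z\right)\varphi(z)\,dz. \]

Then the same argument as above shows that $\lim_n \frac{\partial\sigma_n}{\partial u_j} = \frac{\partial\sigma}{\partial u_j}$ uniformly on $I$. ∎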
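The convolution formula can be tried out in dimension $M = N = 1$. This is a minimal numerical sketch under stated assumptions: the test function $\sigma(u) = u|u|$ (which is of class $C^1$), the midpoint quadrature, the grid on $[-1, 1]$, and all function names are hypothetical choices, and the integral is only approximated.

```python
import math

def bump(z):
    """Unnormalized version of the function in (11.1), for M = 1."""
    return math.exp(1.0 / (z * z - 1.0)) if abs(z) < 1 else 0.0

def midpoint(g, a, b, steps=2000):
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# Constant C chosen so that the (numerical) integral of C * bump over ]-1, 1[ equals 1.
C = 1.0 / midpoint(bump, -1.0, 1.0)

def mollify(sigma, n):
    """sigma_n(u) = integral over ]-1, 1[ of sigma(u - z/n) * C * bump(z) dz."""
    return lambda u: midpoint(lambda z: sigma(u - z / n) * C * bump(z), -1.0, 1.0)

def sigma(u):
    return u * abs(u)

def sup_error(n):
    """Largest deviation between sigma_n and sigma on a grid of [-1, 1]."""
    sigma_n = mollify(sigma, n)
    grid = [k / 10 for k in range(-10, 11)]
    return max(abs(sigma_n(u) - sigma(u)) for u in grid)

for n in (5, 20):
    print(n, sup_error(n))
```

The printed deviations should shrink as $n$ grows, in line with the uniform convergence proved above.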


11.19 The Integral on a Compact Manifold 


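We now assume that $\mathcal{M}$ is a compact M-manifold. We will see how it is possible to
define the integral of a function $f$ on $\mathcal{M}$.

By Theorem 10.39, for every $x \in \mathcal{M}$ there is a local parametrization by some
M-surface $\sigma : I \to \mathbb{R}^N$, where $I$ is a rectangle of $\mathbb{R}^M$ of the type

\[ I = \begin{cases} [-\alpha, \alpha]^M & \text{if } x \notin \partial\mathcal{M}, \\ [-\alpha, \alpha]^{M-1}\times[0, \alpha] & \text{if } x \in \partial\mathcal{M}, \end{cases} \]

and $\sigma(0) = x$.
In the case where $f|_{\mathcal{M}}$, the restriction of $f$ to the set $\mathcal{M}$, is equal to zero outside
the image of a single local M-parametrization $\sigma$, we simply define

\[ \int_{\mathcal{M}} f = \int_\sigma f. \]

Let us verify that this is a good definition.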


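Proposition 11.62 If $\sigma : I \to \mathbb{R}^N$ and $\tilde\sigma : J \to \mathbb{R}^N$ are two local M-parametrizations
and $f|_{\mathcal{M}}$ is equal to zero outside the image of $\sigma$ and outside the image of $\tilde\sigma$,
then $\int_\sigma f = \int_{\tilde\sigma} f$.

Proof If this is the case, we set

\[ \widetilde{\mathcal{M}} = \sigma(I)\cap\tilde\sigma(J), \qquad \tilde I = \sigma^{-1}(\widetilde{\mathcal{M}}), \qquad \tilde J = \tilde\sigma^{-1}(\widetilde{\mathcal{M}}). \]

Following closely the proof of Theorem 11.56, we define the open sets

\[ A = \mathring{I}\cap\sigma^{-1}\big(\widetilde{\mathcal{M}}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))\big), \qquad B = \mathring{J}\cap\tilde\sigma^{-1}\big(\widetilde{\mathcal{M}}\setminus(\sigma(\partial I)\cup\tilde\sigma(\partial J))\big). \]

For every $u \in A$ there is a unique $v \in \mathring{J}$ such that $\tilde\sigma(v) = \sigma(u)$, and we can
then define $\varphi : A \to B$ by setting $\varphi(u) = v$. It can be seen that $\varphi : A \to B$ is a
diffeomorphism such that $\sigma(u) = \tilde\sigma(\varphi(u))$ for every $u \in A$. Proceeding as in the
proof of Theorem 11.55, since $f|_{\mathcal{M}}$ vanishes outside $\sigma(I)\cap\tilde\sigma(J)$, we have that

\[ \begin{aligned} \int_\sigma f &= \int_A f(\sigma(u))\,\|\Sigma(u)\|\,du \\ &= \int_A f(\tilde\sigma(\varphi(u)))\,\|\tilde\Sigma(\varphi(u))\|\,|\det\varphi'(u)|\,du \\ &= \int_B f(\tilde\sigma(v))\,\|\tilde\Sigma(v)\|\,dv = \int_{\tilde\sigma} f, \end{aligned} \]

thereby completing the proof. ∎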


In general, we saw in Theorem 10.39 that for every $x \in \mathcal{M}$ there is a
neighborhood $A'$ of $x$ such that $A' \cap \mathcal{M}$ can be M-parametrized by some function
$\sigma : I \to \mathbb{R}^N$. For every such $x$ there is an open ball $B(x, \rho_x)$ contained in $A'$. We
thus have an open covering of $\mathcal{M}$ made of these open balls. Since $\mathcal{M}$ is compact, by
Theorem 4.9 there exists a finite open subcovering, which we denote by $A_1, \dots, A_n$.
Hence, the open set $V = A_1 \cup \dots \cup A_n$ contains $\mathcal{M}$. We now need the following
result.


Theorem 11.63 There exist some functions $\phi_1, \dots, \phi_n : V \to \mathbb{R}$, of class $C^\infty$,
such that, for every $x \in V$ and every $k \in \{1, \dots, n\}$, the following properties
hold:


(a) $0 \le \phi_k(x) \le 1$,
(b) $x \notin A_k \implies \phi_k(x) = 0$,
(c) $\sum_{k=1}^{n}\phi_k(x) = 1$ for every $x \in \mathcal{M}$.


The functions $\phi_1, \dots, \phi_n$ are said to be a "partition of unity" associated with the
open covering $A_1, \dots, A_n$.


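Proof Let $A_k = B(x_k, \rho_{x_k})$, with $k = 1, \dots, n$. Consider the $C^\infty$-function $f :
\mathbb{R} \to \mathbb{R}$ defined by

\[ f(u) = \begin{cases} \exp\left(\dfrac{1}{u^2 - 1}\right) & \text{if } |u| < 1, \\ 0 & \text{if } |u| \ge 1, \end{cases} \tag{11.2} \]

and set

\[ \psi_k(x) = f\left(\frac{\|x - x_k\|}{\rho_{x_k}}\right). \]

Then, for every $x \in V$, we have $\psi_1(x) + \dots + \psi_n(x) > 0$, and we can define the
functions

\[ \phi_k(x) = \frac{\psi_k(x)}{\psi_1(x) + \dots + \psi_n(x)}. \]

The required properties are now easily verified. ∎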
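The construction in the proof can be tried out on the real line, where the balls $A_k$ are open intervals. This is a hedged sketch: the three centers, the common radius, the sample points, and the helper name `make_partition` are assumptions chosen for illustration.

```python
import math

def f(u):
    """The function (11.2): positive on ]-1, 1[ and zero elsewhere."""
    return math.exp(1.0 / (u * u - 1.0)) if abs(u) < 1 else 0.0

def make_partition(centers, radii):
    """Return phi(k, x) built from psi_k(x) = f(|x - x_k| / rho_k) on the real line."""
    def psi(k, x):
        return f(abs(x - centers[k]) / radii[k])

    def phi(k, x):
        # The denominator is positive whenever x lies in at least one interval A_j.
        total = sum(psi(j, x) for j in range(len(centers)))
        return psi(k, x) / total

    return phi

centers, radii = [0.0, 1.0, 2.0], [0.8, 0.8, 0.8]  # A_k = ]x_k - 0.8, x_k + 0.8[
phi = make_partition(centers, radii)
for x in (-0.5, 0.5, 1.3, 2.5):
    values = [phi(k, x) for k in range(3)]
    print(x, values, sum(values))
```

At each sample point the values lie in $[0, 1]$, vanish for the intervals not containing the point, and add up to 1.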


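Since each $\phi_k\cdot f|_{\mathcal{M}}$ vanishes outside the image of a single local M-parametrization,
we can define the integral of $f$ on $\mathcal{M}$ by

\[ \int_{\mathcal{M}} f = \sum_{k=1}^{n}\int_{\mathcal{M}}\phi_k f. \]

Let us verify that this definition depends neither on the choice of the local M-parametrizations
nor on the particular partition of unity.


Proposition 11.64 If $\tilde A_1, \dots, \tilde A_m$ is any other finite open covering of $\mathcal{M}$
and $\tilde\phi_1, \dots, \tilde\phi_m$ is an associated partition of unity, then

\[ \sum_{k=1}^{n}\int_{\mathcal{M}}\phi_k f = \sum_{j=1}^{m}\int_{\mathcal{M}}\tilde\phi_j f. \]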


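Proof Indeed,

\[ \sum_{k=1}^{n}\int_{\mathcal{M}}\phi_k f = \sum_{k=1}^{n}\int_{\mathcal{M}}\Big(\sum_{j=1}^{m}\tilde\phi_j\Big)\phi_k f = \sum_{k=1}^{n}\sum_{j=1}^{m}\int_{\mathcal{M}}\tilde\phi_j\phi_k f = \sum_{j=1}^{m}\int_{\mathcal{M}}\Big(\sum_{k=1}^{n}\phi_k\Big)\tilde\phi_j f = \sum_{j=1}^{m}\int_{\mathcal{M}}\tilde\phi_j f, \]

which is what we had to verify. ∎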


It is now possible to define the M-dimensional measure of an M-manifold $\mathcal{M}$ as
the integral over $\mathcal{M}$ of the constant function 1. We can use the notation

\[ \mu_M(\mathcal{M}) = \int_{\mathcal{M}} 1 \]

since this quantity coincides with the already defined M-dimensional measure
introduced when $\mathcal{M}$ is an M-parametrizable set. In the cases $M = 1$ or $2$, the
M-dimensional measure of $\mathcal{M}$ is often called the "length" or "area" of $\mathcal{M}$, respectively.

If $M = N$, then it can be verified that the N-dimensional measure of the manifold
$\mathcal{M}$ coincides with the usual measure, i.e., $\mu_N(\mathcal{M}) = \mu(\mathcal{M})$.



12 Differential Forms


Let us start considering the projections in $\mathbb{R}^N$; as we have already seen, these are
the functions $p_m : \mathbb{R}^N \to \mathbb{R}$ defined by

\[ p_m(x_1, x_2, \dots, x_N) = x_m. \]

However, it will now be useful to use a different notation; instead of $p_m$, we will
write $dx_m$. In this way, the known formula for the differential,

\[ df(x_0)(h) = \sum_{m=1}^{N}\frac{\partial f}{\partial x_m}(x_0)\,h_m, \]

where $h = (h_1, h_2, \dots, h_N)$ is any vector in $\mathbb{R}^N$, can be written in the form

\[ df(x_0)(h) = \sum_{m=1}^{N}\frac{\partial f}{\partial x_m}(x_0)\,dx_m(h) \]

or, more succinctly,

\[ df(x_0) = \sum_{m=1}^{N}\frac{\partial f}{\partial x_m}(x_0)\,dx_m. \]


Here we have a first example of what will be called a “differential form.” 


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
A. Fonda, A Modern Introduction to Mathematical Analysis,
https://doi.org/10.1007/978-3-031-23713-3_12


12.1 An Informal Definition

Let us introduce an operation between these symbols $dx_m$, which looks like a
"product"; we will denote it by the symbol $\wedge$. Without entering into its precise
definition (which will be provided in Sect. 12.14), we will simply explain its main
properties.

The crucial feature of this operation is that it is antisymmetric, i.e.,

\[ dx_i\wedge dx_j = -\,dx_j\wedge dx_i. \]

We can also multiply these objects several times, maintaining the rule that when two
of them are interchanged, there is a change in sign, e.g.,

\[ dx_{i_1}\wedge\dots\wedge dx_j\wedge\dots\wedge dx_k\wedge\dots\wedge dx_{i_M} = -\,dx_{i_1}\wedge\dots\wedge dx_k\wedge\dots\wedge dx_j\wedge\dots\wedge dx_{i_M}. \]

Note that if two indices happen to be the same, then

\[ dx_{i_1}\wedge\dots\wedge dx_j\wedge\dots\wedge dx_j\wedge\dots\wedge dx_{i_M} = 0. \]

This is why we will usually consider the indices in a strictly increasing order, i.e.,

\[ dx_{i_1}\wedge\dots\wedge dx_{i_M}, \quad\text{with } 1 \le i_1 < \dots < i_M \le N. \]

All the other products of $M$ elements are either equal to 0 or can be reduced to this
form after a reordering of the indices, possibly leading to a change of sign.

Let us analyze, for example, the case $N = 3$. We have here $dx_1$, $dx_2$, and $dx_3$. If
we take two of them with the indices in strictly increasing order, we obtain

\[ dx_1\wedge dx_2, \qquad dx_1\wedge dx_3, \qquad dx_2\wedge dx_3. \]

In the other cases, we have

\[ dx_2\wedge dx_1 = -\,dx_1\wedge dx_2, \qquad dx_3\wedge dx_1 = -\,dx_1\wedge dx_3, \qquad dx_3\wedge dx_2 = -\,dx_2\wedge dx_3, \]

while

\[ dx_1\wedge dx_1 = dx_2\wedge dx_2 = dx_3\wedge dx_3 = 0. \]

Moreover, there is a unique product of three elements with strictly increasing
indices:

\[ dx_1\wedge dx_2\wedge dx_3. \]

Concerning the other products of three elements, we have

\[ dx_2\wedge dx_3\wedge dx_1 = dx_3\wedge dx_1\wedge dx_2 = dx_1\wedge dx_2\wedge dx_3 \]

and

\[ dx_1\wedge dx_3\wedge dx_2 = dx_3\wedge dx_2\wedge dx_1 = dx_2\wedge dx_1\wedge dx_3 = -\,dx_1\wedge dx_2\wedge dx_3, \]

while all those having two or three coinciding indices are equal to 0.
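The reordering rule can be mechanized: sort the indices by exchanging adjacent entries and flip the sign at every exchange. The function name `normalize_wedge` and the tuple-based representation below are illustrative assumptions, not notation from the text.

```python
def normalize_wedge(indices):
    """Reduce dx_{i1} ^ ... ^ dx_{iM} to increasing order.

    Returns (sign, sorted_indices); sign is 0 when an index is repeated,
    in which case the product vanishes and the index tuple is empty.
    """
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    # Bubble sort: every exchange of two adjacent factors changes the sign.
    for done in range(len(idx)):
        for j in range(len(idx) - 1 - done):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

print(normalize_wedge((2, 3, 1)))  # dx2 ^ dx3 ^ dx1
print(normalize_wedge((1, 3, 2)))  # dx1 ^ dx3 ^ dx2
print(normalize_wedge((3, 3)))     # repeated index
```

Applied to the six orderings of $(1, 2, 3)$, this reproduces the signs listed above for $N = 3$.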
Let $O$ be an open subset of $\mathbb{R}^N$ and $M$ a positive integer. We will call "differential
form of degree $M$" (or "M-differential form") a function defined on $O$ by an
expression of the type

\[ \omega(x) = \sum_{1\le i_1<\dots<i_M\le N} f_{i_1,\dots,i_M}(x)\,dx_{i_1}\wedge\dots\wedge dx_{i_M}. \]

The functions $f_{i_1,\dots,i_M} : O \to \mathbb{R}$ are the "components" of $\omega$. We will say that $\omega$ is of
class $C^k$ if all its components are of that class. The set of all M-differential forms
of class $C^k$ defined on the subset $O$ of $\mathbb{R}^N$ will be denoted by

\[ \mathcal{F}_M^k(O, \mathbb{R}^N). \]

If $k = 0$ (i.e., when the components of the differential forms are continuous), we will
simply write $\mathcal{F}_M(O, \mathbb{R}^N)$. Note that $\omega(x)$ is determined by the $\binom{N}{M}$-dimensional
vector

\[ F(x) = \left(f_{i_1,\dots,i_M}(x)\right)_{1\le i_1<\dots<i_M\le N}. \]

We will call 0-differential form any function defined on $O$, with values in $\mathbb{R}$. Hence,

\[ \mathcal{F}_0^k(O, \mathbb{R}^N) = C^k(O, \mathbb{R}). \]


Let us take a closer look at the case $N = 3$. Denoting by $\omega_M$ an M-differential
form, with $M = 1, 2, 3$, we can write

\[ \begin{aligned} \omega_1(x) &= f_1(x)\,dx_1 + f_2(x)\,dx_2 + f_3(x)\,dx_3, \\ \omega_2(x) &= f_{12}(x)\,dx_1\wedge dx_2 + f_{13}(x)\,dx_1\wedge dx_3 + f_{23}(x)\,dx_2\wedge dx_3, \\ \omega_3(x) &= f_{123}(x)\,dx_1\wedge dx_2\wedge dx_3. \end{aligned} \]

Notice that $\omega_1(x)$ and $\omega_2(x)$ are determined by the three-dimensional vectors

\[ F(x) = (f_1(x), f_2(x), f_3(x)) \quad\text{and}\quad F(x) = (f_{12}(x), f_{13}(x), f_{23}(x)), \]

respectively, while $\omega_3(x)$ is determined by a single function $f_{123}(x)$.


Henceforth, a function like $F : O \to \mathbb{R}^N$, with $O \subseteq \mathbb{R}^N$, will be called a "vector
field."


12.2 Algebraic Operations

To simplify the notation, we will sometimes write

\[ dx_{i_1}\wedge\dots\wedge dx_{i_M} = dx_{i_1,\dots,i_M}. \]

Hence, the previously defined differential form will also be written as follows:

\[ \omega(x) = \sum_{1\le i_1<\dots<i_M\le N} f_{i_1,\dots,i_M}(x)\,dx_{i_1,\dots,i_M}. \]

It is possible to define the sum of two M-differential forms: If $\omega$ and $\tilde\omega$ are both
defined on $O$, writing

\[ \tilde\omega(x) = \sum_{1\le i_1<\dots<i_M\le N} g_{i_1,\dots,i_M}(x)\,dx_{i_1,\dots,i_M}, \]

we define $\omega + \tilde\omega$ in a natural way as

\[ (\omega + \tilde\omega)(x) = \sum_{1\le i_1<\dots<i_M\le N}\big(f_{i_1,\dots,i_M}(x) + g_{i_1,\dots,i_M}(x)\big)\,dx_{i_1,\dots,i_M}. \]

Moreover, if $c \in \mathbb{R}$, then we define $c\,\omega$, the product of the scalar $c$ by the
M-differential form $\omega$, as

\[ (c\,\omega)(x) = \sum_{1\le i_1<\dots<i_M\le N} c\,f_{i_1,\dots,i_M}(x)\,dx_{i_1,\dots,i_M}. \]

With these definitions, it can be checked that the set $\mathcal{F}_M(O, \mathbb{R}^N)$ is a real vector
space.
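Given two differential forms $\omega \in \mathcal{F}_M(O, \mathbb{R}^N)$, $\tilde\omega \in \mathcal{F}_{\tilde M}(O, \mathbb{R}^N)$, of degrees $M$
and $\tilde M$, respectively, we now define the differential form $\omega\wedge\tilde\omega$, of degree $M + \tilde M$,
which is called the "exterior product" of $\omega$ and $\tilde\omega$. If

\[ \omega(x) = \sum_{1\le i_1<\dots<i_M\le N} f_{i_1,\dots,i_M}(x)\,dx_{i_1,\dots,i_M} \]

and

\[ \tilde\omega(x) = \sum_{1\le j_1<\dots<j_{\tilde M}\le N} g_{j_1,\dots,j_{\tilde M}}(x)\,dx_{j_1,\dots,j_{\tilde M}}, \]

then we set

\[ (\omega\wedge\tilde\omega)(x) = \sum_{\substack{1\le i_1<\dots<i_M\le N \\ 1\le j_1<\dots<j_{\tilde M}\le N}} f_{i_1,\dots,i_M}(x)\,g_{j_1,\dots,j_{\tilde M}}(x)\,dx_{i_1,\dots,i_M,j_1,\dots,j_{\tilde M}}. \]

Usually the symbol $\wedge$ is omitted when one of the two is a 0-differential form, since
the exterior product is, in this case, similar to the product with a scalar. Notice that,
in the preceding sum, all elements with a repeated index will be zero. Here are some
properties of the exterior product.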


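Proposition 12.1 If $\omega$, $\tilde\omega$, $\hat\omega$ are three differential forms of degrees $M$, $\tilde M$, $\hat M$,
respectively, then

\[ \tilde\omega\wedge\omega = (-1)^{M\tilde M}\,\omega\wedge\tilde\omega, \]

\[ (\omega\wedge\tilde\omega)\wedge\hat\omega = \omega\wedge(\tilde\omega\wedge\hat\omega); \]

if $c \in \mathbb{R}$, then

\[ (c\,\omega)\wedge\tilde\omega = \omega\wedge(c\,\tilde\omega) = c\,(\omega\wedge\tilde\omega); \]

moreover, when $M = \tilde M$,

\[ (\omega + \tilde\omega)\wedge\hat\omega = (\omega\wedge\hat\omega) + (\tilde\omega\wedge\hat\omega), \]

\[ \hat\omega\wedge(\omega + \tilde\omega) = (\hat\omega\wedge\omega) + (\hat\omega\wedge\tilde\omega). \]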
Proof Assume that $\omega$ and $\tilde\omega$ are written as above, and let

\[ \hat\omega(x) = \sum_{1\le k_1<\dots<k_{\hat M}\le N} h_{k_1,\dots,k_{\hat M}}(x)\,dx_{k_1,\dots,k_{\hat M}}. \]

The first identity is obtained observing that, in order to arrive from the sequence of
indices $i_1, \dots, i_M, j_1, \dots, j_{\tilde M}$ at $j_1, \dots, j_{\tilde M}, i_1, \dots, i_M$, we must first move $j_1$ to
the left making $M$ exchanges, then do the same for $j_2$, and so on, until we reach
$j_{\tilde M}$. In the end, it is then necessary to perform $M\tilde M$ exchanges of indices. Taking
into account the fact that the differential form changes sign each time there is an
exchange of indices, we have the formula we wanted to prove.

The proof of the second identity (associative property) shows no great difficulties,
and also the identities where the constant $c$ appears can be easily verified.


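Concerning the distributive property, when $M = \tilde M$, we have

\[ \begin{aligned} ((\omega + \tilde\omega)\wedge\hat\omega)(x) &= \sum_{\substack{1\le i_1<\dots<i_M\le N \\ 1\le k_1<\dots<k_{\hat M}\le N}}\big(f_{i_1,\dots,i_M}(x) + g_{i_1,\dots,i_M}(x)\big)\,h_{k_1,\dots,k_{\hat M}}(x)\,dx_{i_1,\dots,i_M,k_1,\dots,k_{\hat M}} \\ &= \sum_{\substack{1\le i_1<\dots<i_M\le N \\ 1\le k_1<\dots<k_{\hat M}\le N}}\big(f_{i_1,\dots,i_M}(x)\,h_{k_1,\dots,k_{\hat M}}(x) + g_{i_1,\dots,i_M}(x)\,h_{k_1,\dots,k_{\hat M}}(x)\big)\,dx_{i_1,\dots,i_M,k_1,\dots,k_{\hat M}} \\ &= ((\omega\wedge\hat\omega) + (\tilde\omega\wedge\hat\omega))(x). \end{aligned} \]

The last identity is proved either in an analogous way or by using the first and fourth
identities. ∎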
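For forms with constant components, the exterior product and the sign rule of Proposition 12.1 can be tested on a small example. The sketch below is only illustrative: representing a form as a dictionary from increasing index tuples to coefficients, and the names `basis_sign` and `wedge`, are assumptions made here.

```python
def basis_sign(indices):
    """Sign needed to sort the indices (0 if an index is repeated), via inversion parity."""
    if len(set(indices)) < len(indices):
        return 0
    inversions = sum(
        1
        for a in range(len(indices))
        for b in range(a + 1, len(indices))
        if indices[a] > indices[b]
    )
    return -1 if inversions % 2 else 1

def wedge(omega, eta):
    """Exterior product of two constant-coefficient forms given as {index_tuple: coefficient}."""
    result = {}
    for i, f in omega.items():
        for j, g in eta.items():
            sign = basis_sign(i + j)
            if sign:
                key = tuple(sorted(i + j))
                result[key] = result.get(key, 0) + sign * f * g
    return {key: value for key, value in result.items() if value != 0}

omega = {(1,): 1, (2,): 2, (3,): 3}   # dx1 + 2 dx2 + 3 dx3
eta = {(1,): 4, (2,): 5, (3,): 6}     # 4 dx1 + 5 dx2 + 6 dx3
tau = {(2, 3): 7}                     # 7 dx2 ^ dx3

print(wedge(omega, eta))
print(wedge(eta, omega))
print(wedge(omega, tau), wedge(tau, omega))
```

For the two 1-forms the product changes sign when the factors are swapped, whereas a 1-form and a 2-form commute, as predicted by the factor $(-1)^{M\tilde M}$.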


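12.3 The Exterior Differential


Given an M-differential form $\omega$ of class $C^1$, we want to define the differential form
$d_{ex}\omega$, of degree $M + 1$, which is said to be the "exterior differential" of $\omega$.

If $\omega$ is a 0-differential form, $\omega = f : O \to \mathbb{R}$, its exterior differential $d_{ex}\omega(x)$
is simply the usual differential

\[ df(x) = \sum_{m=1}^{N}\frac{\partial f}{\partial x_m}(x)\,dx_m. \]

In the general case, if

\[ \omega(x) = \sum_{1\le i_1<\dots<i_M\le N} f_{i_1,\dots,i_M}(x)\,dx_{i_1}\wedge\dots\wedge dx_{i_M}, \]

we set

\[ d_{ex}\omega(x) = \sum_{1\le i_1<\dots<i_M\le N} df_{i_1,\dots,i_M}(x)\wedge dx_{i_1}\wedge\dots\wedge dx_{i_M} \]

or, equivalently,

\[ d_{ex}\omega(x) = \sum_{1\le i_1<\dots<i_M\le N}\sum_{m=1}^{N}\frac{\partial f_{i_1,\dots,i_M}}{\partial x_m}(x)\,dx_m\wedge dx_{i_1}\wedge\dots\wedge dx_{i_M}. \]

In what follows, to simplify the notation, we will always write $d\omega$ instead of $d_{ex}\omega$.
Let us consider some properties of the exterior differential.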


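Proposition 12.2 If $\omega$ and $\tilde\omega$ are two differential forms of class $C^1$, of degrees $M$
and $\tilde M$, respectively, then

\[ d(\omega\wedge\tilde\omega) = d\omega\wedge\tilde\omega + (-1)^M\,\omega\wedge d\tilde\omega; \]

if $M = \tilde M$ and $c \in \mathbb{R}$, then we have

\[ d(\omega + \tilde\omega) = d\omega + d\tilde\omega, \qquad d(c\,\omega) = c\,d\omega; \]

if $\omega$ is of class $C^2$, then

\[ d(d\omega) = 0. \]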

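Proof Concerning the first identity, if $\omega$ and $\tilde\omega$ are as above, we have

\[ \begin{aligned} d(\omega\wedge\tilde\omega)(x) &= \sum_{\substack{1\le i_1<\dots<i_M\le N \\ 1\le j_1<\dots<j_{\tilde M}\le N}}\sum_{m=1}^{N}\frac{\partial}{\partial x_m}\big(f_{i_1,\dots,i_M}\,g_{j_1,\dots,j_{\tilde M}}\big)(x)\,dx_{m,i_1,\dots,i_M,j_1,\dots,j_{\tilde M}} \\ &= \sum_{\substack{1\le i_1<\dots<i_M\le N \\ 1\le j_1<\dots<j_{\tilde M}\le N}}\sum_{m=1}^{N}\left(\frac{\partial f_{i_1,\dots,i_M}}{\partial x_m}\,g_{j_1,\dots,j_{\tilde M}} + f_{i_1,\dots,i_M}\,\frac{\partial g_{j_1,\dots,j_{\tilde M}}}{\partial x_m}\right)(x)\,dx_{m,i_1,\dots,i_M,j_1,\dots,j_{\tilde M}} \\ &= \sum_{\substack{1\le i_1<\dots<i_M\le N \\ 1\le j_1<\dots<j_{\tilde M}\le N}}\sum_{m=1}^{N}\left(\frac{\partial f_{i_1,\dots,i_M}}{\partial x_m}\,g_{j_1,\dots,j_{\tilde M}}\right)(x)\,dx_{m,i_1,\dots,i_M,j_1,\dots,j_{\tilde M}} \\ &\quad + (-1)^M\sum_{\substack{1\le i_1<\dots<i_M\le N \\ 1\le j_1<\dots<j_{\tilde M}\le N}}\sum_{m=1}^{N}\left(f_{i_1,\dots,i_M}\,\frac{\partial g_{j_1,\dots,j_{\tilde M}}}{\partial x_m}\right)(x)\,dx_{i_1,\dots,i_M,m,j_1,\dots,j_{\tilde M}} \\ &= (d\omega\wedge\tilde\omega)(x) + (-1)^M(\omega\wedge d\tilde\omega)(x). \end{aligned} \]

The second and third identities follow easily from the linearity of the derivative.
Concerning the last identity, we can see that

\[ d(d\omega)(x) = \sum_{1\le i_1<\dots<i_M\le N}\sum_{k=1}^{N}\sum_{m=1}^{N}\frac{\partial^2 f_{i_1,\dots,i_M}}{\partial x_k\,\partial x_m}(x)\,dx_{k,m,i_1,\dots,i_M}. \]

Since, by Schwarz's Theorem 10.9,

\[ \frac{\partial^2 f_{i_1,\dots,i_M}}{\partial x_k\,\partial x_m} = \frac{\partial^2 f_{i_1,\dots,i_M}}{\partial x_m\,\partial x_k}, \]

taking into account the fact that $dx_k\wedge dx_m = -\,dx_m\wedge dx_k$, it is seen that all the
terms in the sums pairwise eliminate one another, so that $d(d\omega)(x) = 0$. ∎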


12.4 Differential Forms in $\mathbb{R}^3$


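In the special case $N = 3$, in view of the applications of the theory of differential
forms to some important physical situations, we prefer adopting a different order in
the components of a 2-differential form. Instead of the increasing ordering of the
indices that has been adopted so far,

\[ dx_1\wedge dx_2, \qquad dx_1\wedge dx_3, \qquad dx_2\wedge dx_3, \]

henceforth we will prefer to take

\[ dx_2\wedge dx_3, \qquad dx_3\wedge dx_1, \qquad dx_1\wedge dx_2. \]

Let us then investigate the operation of the exterior product in this case, with the
newly adopted convention.
If $\omega_1$ and $\tilde\omega_1$ are two 1-differential forms, e.g.,

\[ \begin{aligned} \omega_1(x) &= f_1(x)\,dx_1 + f_2(x)\,dx_2 + f_3(x)\,dx_3, \\ \tilde\omega_1(x) &= \tilde f_1(x)\,dx_1 + \tilde f_2(x)\,dx_2 + \tilde f_3(x)\,dx_3, \end{aligned} \]

we compute

\[ \begin{aligned} (\omega_1\wedge\tilde\omega_1)(x) &= \big(f_2(x)\tilde f_3(x) - f_3(x)\tilde f_2(x)\big)\,dx_2\wedge dx_3 \\ &\quad + \big(f_3(x)\tilde f_1(x) - f_1(x)\tilde f_3(x)\big)\,dx_3\wedge dx_1 \\ &\quad + \big(f_1(x)\tilde f_2(x) - f_2(x)\tilde f_1(x)\big)\,dx_1\wedge dx_2. \end{aligned} \]

Hence, if $F = (f_1, f_2, f_3)$ and $\tilde F = (\tilde f_1, \tilde f_2, \tilde f_3)$ are the vector fields corresponding
to $\omega_1$ and $\tilde\omega_1$, respectively, we see that the vector field determined by $\omega_1\wedge\tilde\omega_1$ is

\[ F\times\tilde F = \big(f_2\tilde f_3 - f_3\tilde f_2,\ f_3\tilde f_1 - f_1\tilde f_3,\ f_1\tilde f_2 - f_2\tilde f_1\big), \]

the "cross product" of $F$ and $\tilde F$.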


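On the other hand, if $\omega_1$ is a 1-differential form and $\omega_2$ is a 2-differential form,
e.g.,

\[ \begin{aligned} \omega_1(x) &= f_1(x)\,dx_1 + f_2(x)\,dx_2 + f_3(x)\,dx_3, \\ \omega_2(x) &= \tilde f_1(x)\,dx_2\wedge dx_3 + \tilde f_2(x)\,dx_3\wedge dx_1 + \tilde f_3(x)\,dx_1\wedge dx_2, \end{aligned} \]

we have

\[ (\omega_1\wedge\omega_2)(x) = \big(f_1(x)\tilde f_1(x) + f_2(x)\tilde f_2(x) + f_3(x)\tilde f_3(x)\big)\,dx_1\wedge dx_2\wedge dx_3. \]

Hence, if $F = (f_1, f_2, f_3)$ and $\tilde F = (\tilde f_1, \tilde f_2, \tilde f_3)$ are the vector fields corresponding
to $\omega_1$ and $\omega_2$, respectively, we see that the scalar function determined by $\omega_1\wedge\omega_2$ is

\[ F\cdot\tilde F = f_1\tilde f_1 + f_2\tilde f_2 + f_3\tilde f_3, \]

the "scalar product" of $F$ and $\tilde F$.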
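If we have a 0-differential form $\omega_0 = f : O \to \mathbb{R}$, then

\[ d\omega_0(x) = \frac{\partial f}{\partial x_1}(x)\,dx_1 + \frac{\partial f}{\partial x_2}(x)\,dx_2 + \frac{\partial f}{\partial x_3}(x)\,dx_3, \]

and we know that the corresponding vector field is the "gradient" of $f$,

\[ \nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}\right). \]

Taking a 1-differential form

\[ \omega_1(x) = f_1(x)\,dx_1 + f_2(x)\,dx_2 + f_3(x)\,dx_3, \]

we compute

\[ \begin{aligned} d\omega_1(x) &= \left(\frac{\partial f_3}{\partial x_2}(x) - \frac{\partial f_2}{\partial x_3}(x)\right)dx_2\wedge dx_3 \\ &\quad + \left(\frac{\partial f_1}{\partial x_3}(x) - \frac{\partial f_3}{\partial x_1}(x)\right)dx_3\wedge dx_1 \\ &\quad + \left(\frac{\partial f_2}{\partial x_1}(x) - \frac{\partial f_1}{\partial x_2}(x)\right)dx_1\wedge dx_2. \end{aligned} \]

If $F = (f_1, f_2, f_3)$ is the vector field associated with $\omega_1$, we call "curl" of $F$ the
vector field corresponding to $d\omega_1$, and we write

\[ \nabla\times F = \left(\frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3},\ \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1},\ \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\right). \]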

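Finally, if we consider a 2-differential form

\[ \omega_2(x) = f_1(x)\,dx_2\wedge dx_3 + f_2(x)\,dx_3\wedge dx_1 + f_3(x)\,dx_1\wedge dx_2, \]

then

\[ d\omega_2(x) = \left(\frac{\partial f_1}{\partial x_1}(x) + \frac{\partial f_2}{\partial x_2}(x) + \frac{\partial f_3}{\partial x_3}(x)\right)dx_1\wedge dx_2\wedge dx_3. \]

If $F = (f_1, f_2, f_3)$ is the vector field associated with $\omega_2$, we call "divergence" of $F$
the scalar function corresponding to $d\omega_2$, and we write

\[ \nabla\cdot F = \frac{\partial f_1}{\partial x_1} + \frac{\partial f_2}{\partial x_2} + \frac{\partial f_3}{\partial x_3}. \]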


We will explain later on the physical meaning of curl and divergence. 

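The properties of the exterior product and those of the exterior differential lead to
formulas involving the gradient, the curl, and the divergence. Taking $f : O \to \mathbb{R}$,
$\tilde f : O \to \mathbb{R}$, $F : O \to \mathbb{R}^3$, and $\tilde F : O \to \mathbb{R}^3$ we have the following formulas:

\[ \begin{aligned} &\nabla\times(\nabla f) = 0, \\ &\nabla\cdot(\nabla\times F) = 0, \\ &\nabla(f\tilde f) = \tilde f\,(\nabla f) + f\,(\nabla\tilde f), \\ &\nabla\times(fF) = (\nabla f)\times F + f\,(\nabla\times F), \\ &\nabla\cdot(fF) = (\nabla f)\cdot F + f\,(\nabla\cdot F), \\ &\nabla\cdot(F\times\tilde F) = (\nabla\times F)\cdot\tilde F - F\cdot(\nabla\times\tilde F). \end{aligned} \]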


The proofs are left to the reader. 
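Before attempting the proofs, the first two formulas can be checked numerically on examples. The sketch below replaces partial derivatives with central difference quotients, so it only gives approximate evidence; the step size, the sample functions, and the names `partial`, `grad`, `curl`, `div` are illustrative assumptions.

```python
import math

STEP = 1e-3

def partial(g, x, k):
    """Central difference quotient of g with respect to the k-th variable at x."""
    forward, backward = list(x), list(x)
    forward[k] += STEP
    backward[k] -= STEP
    return (g(forward) - g(backward)) / (2 * STEP)

def grad(f):
    return lambda x: [partial(f, x, k) for k in range(3)]

def curl(F):
    def curl_at(x):
        d = lambda i, k: partial(lambda y: F(y)[i], x, k)  # d f_i / d x_k
        return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]
    return curl_at

def div(F):
    return lambda x: sum(partial(lambda y, k=k: F(y)[k], x, k) for k in range(3))

f = lambda x: math.sin(x[0]) * x[1] + x[2] ** 2 * x[0]
F = lambda x: [x[1] * x[2], x[0] ** 2, math.sin(x[0] * x[1])]
point = [0.3, -1.1, 0.7]

print(curl(grad(f))(point))   # expected to be close to (0, 0, 0)
print(div(curl(F))(point))    # expected to be close to 0
```

The cancellation mirrors the proof of $d(d\omega) = 0$: the mixed difference quotients commute, just as mixed partial derivatives do by Schwarz's Theorem.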


12.5 The Integral on an M-Surface 


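We want to define the notion of integral of an M-differential form $\omega \in \mathcal{F}_M(O, \mathbb{R}^N)$
on an M-surface $\sigma : I \to \mathbb{R}^N$, with $1 \le M \le N$. Let

\[ \omega(x) = \sum_{1\le i_1<\dots<i_M\le N} f_{i_1,\dots,i_M}(x)\,dx_{i_1}\wedge\dots\wedge dx_{i_M}. \]

Recall, for every $x \in O$ and every $u \in I$, the $\binom{N}{M}$-dimensional vectors

\[ F(x) = \left(f_{i_1,\dots,i_M}(x)\right)_{1\le i_1<\dots<i_M\le N}, \qquad \Sigma(u) = \left(\det\sigma'_{(i_1,\dots,i_M)}(u)\right)_{1\le i_1<\dots<i_M\le N}. \]

If $\sigma(I) \subseteq O$, then, denoting by "$\cdot$" the Euclidean scalar product in $\mathbb{R}^{\binom{N}{M}}$, we set

\[ \int_\sigma\omega = \int_I F(\sigma(u))\cdot\Sigma(u)\,du. \]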


Let us consider the meaning of the given definition in two special cases.
If $M = 1$, then $\sigma : [a, b] \to \mathbb{R}^N$ is a curve and $\omega$ is the 1-differential form

\[ \omega(x) = f_1(x)\,dx_1 + \dots + f_N(x)\,dx_N. \]

Then we have $F(x) = (f_1(x), \dots, f_N(x))$ and

\[ \int_\sigma\omega = \int_a^b F(\sigma(t))\cdot\sigma'(t)\,dt. \]

This quantity will be called the "line integral" of the vector field $F = (f_1, \dots, f_N)$
along the curve $\sigma$, and will be denoted by

\[ \int_\sigma F\cdot d\ell. \]


In mechanics, this concept is used, for example, to define the “work” done by a field 
of forces on a particle moving along a curve. 


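Example Let us compute the line integral of the vector field $F(x, y, z) =
(-y, x, z^2)$ along the curve $\sigma : [0, 2\pi] \to \mathbb{R}^3$, defined by $\sigma(t) = (\cos t, \sin t, t)$:

\[ \begin{aligned} \int_\sigma F\cdot d\ell &= \int_0^{2\pi}(-\sin t, \cos t, t^2)\cdot(-\sin t, \cos t, 1)\,dt \\ &= \int_0^{2\pi}(\sin^2 t + \cos^2 t + t^2)\,dt = 2\pi + \frac{8\pi^3}{3}. \end{aligned} \]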
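The value of this line integral can be cross-checked with a simple quadrature of $F(\sigma(t))\cdot\sigma'(t)$. The helper name `line_integral`, the midpoint rule, and the number of steps are assumptions made for this sketch.

```python
import math

def line_integral(field, curve, velocity, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of F(sigma(t)) . sigma'(t) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += sum(fc * vc for fc, vc in zip(field(*curve(t)), velocity(t)))
    return total * h

field = lambda x, y, z: (-y, x, z ** 2)
curve = lambda t: (math.cos(t), math.sin(t), t)
velocity = lambda t: (-math.sin(t), math.cos(t), 1.0)

value = line_integral(field, curve, velocity, 0.0, 2 * math.pi)
print(value, 2 * math.pi + 8 * math.pi ** 3 / 3)
```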


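If $M = 2$ and $N = 3$, then $\sigma : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3$ is a surface and $\omega$ is
the 2-differential form

\[ \omega(x) = f_1(x)\,dx_2\wedge dx_3 + f_2(x)\,dx_3\wedge dx_1 + f_3(x)\,dx_1\wedge dx_2. \]

Then we have $F(x) = (f_1(x), f_2(x), f_3(x))$, and

\[ \begin{aligned} \int_\sigma\omega &= \int_{a_2}^{b_2}\int_{a_1}^{b_1}\Bigg[f_1(\sigma(u,v))\det\begin{pmatrix}\frac{\partial\sigma_2}{\partial u}(u,v) & \frac{\partial\sigma_2}{\partial v}(u,v) \\ \frac{\partial\sigma_3}{\partial u}(u,v) & \frac{\partial\sigma_3}{\partial v}(u,v)\end{pmatrix} \\ &\qquad + f_2(\sigma(u,v))\det\begin{pmatrix}\frac{\partial\sigma_3}{\partial u}(u,v) & \frac{\partial\sigma_3}{\partial v}(u,v) \\ \frac{\partial\sigma_1}{\partial u}(u,v) & \frac{\partial\sigma_1}{\partial v}(u,v)\end{pmatrix} \\ &\qquad + f_3(\sigma(u,v))\det\begin{pmatrix}\frac{\partial\sigma_1}{\partial u}(u,v) & \frac{\partial\sigma_1}{\partial v}(u,v) \\ \frac{\partial\sigma_2}{\partial u}(u,v) & \frac{\partial\sigma_2}{\partial v}(u,v)\end{pmatrix}\Bigg]\,du\,dv \\ &= \int_{a_2}^{b_2}\int_{a_1}^{b_1} F(\sigma(u,v))\cdot\left(\frac{\partial\sigma}{\partial u}(u,v)\times\frac{\partial\sigma}{\partial v}(u,v)\right)du\,dv. \end{aligned} \]

This quantity is called the "surface integral" or "flux" of the vector field $F =
(f_1, f_2, f_3)$ through the surface $\sigma$ and will be denoted by

\[ \int_\sigma F\cdot dS. \]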


In fluid dynamics, this concept is used, for instance, to define the amount of fluid 
crossing a given surface in the unit time. 


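Example Let us compute the flux of the vector field $F(x, y, z) = (-y, x, z^3)$ through the surface $\sigma : [0,1] \times [0,1] \to \mathbb{R}^3$, defined by $\sigma(u,v) = (u^2, v, u+v)$. Since $\frac{\partial \sigma}{\partial u} \times \frac{\partial \sigma}{\partial v} = (2u, 0, 1) \times (0, 1, 1) = (-1, -2u, 2u)$, we have

$$\begin{aligned}
\int_\sigma F \cdot dS &= \int_0^1 \int_0^1 \big(-v, u^2, (u+v)^3\big) \cdot (-1, -2u, 2u)\, du\, dv \\
&= \int_0^1 \int_0^1 \big(v - 2u^3 + 2u^4 + 6u^3 v + 6u^2 v^2 + 2u v^3\big)\, du\, dv = \frac{31}{15}\,.
\end{aligned}$$
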
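As a sanity check on the flux computation, here is a small numerical sketch. It assumes the reading $F(x,y,z) = (-y, x, z^3)$ of the field, which is the one consistent with the expanded integrand; the helper names (`simpson`, `normal`, `integrand`) are illustrative only. The cross product $\partial\sigma/\partial u \times \partial\sigma/\partial v$ is written out by hand and the double integral is approximated by nested Simpson rules.

```python
def simpson(g, a, b, n=100):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def F(x, y, z):
    # assumed reading of the field of the example: third component z**3
    return (-y, x, z ** 3)

def sigma(u, v):
    # the surface of the example
    return (u * u, v, u + v)

def normal(u, v):
    # cross product of d(sigma)/du = (2u, 0, 1) and d(sigma)/dv = (0, 1, 1)
    su = (2.0 * u, 0.0, 1.0)
    sv = (0.0, 1.0, 1.0)
    return (su[1] * sv[2] - su[2] * sv[1],
            su[2] * sv[0] - su[0] * sv[2],
            su[0] * sv[1] - su[1] * sv[0])

def integrand(u, v):
    # scalar product F(sigma(u, v)) . (sigma_u x sigma_v)
    return sum(f * n for f, n in zip(F(*sigma(u, v)), normal(u, v)))

flux = simpson(lambda v: simpson(lambda u: integrand(u, v), 0.0, 1.0), 0.0, 1.0)
print(flux, 31 / 15)
```

The integrand is a polynomial, so the only discretization error comes from the fourth-degree term in $u$ and is far below the printed precision.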
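It is important to analyze how the integral of a differential form $\omega$ changes on two equivalent $M$-surfaces. We recall that $\sigma : I \to \mathbb{R}^N$ and $\tilde\sigma : J \to \mathbb{R}^N$ are equivalent if there exists a diffeomorphism $\varphi : A \to B$ such that $\sigma(\boldsymbol{u}) = \tilde\sigma(\varphi(\boldsymbol{u}))$ for every $\boldsymbol{u} \in A$, and the sets $I \setminus A$, $J \setminus B$ are negligible.


Definition 12.3. We say that the equivalent $M$-surfaces $\sigma$ and $\tilde\sigma$ have the “same orientation” if $\det \varphi'(\boldsymbol{u}) > 0$ for every $\boldsymbol{u} \in A$; they have the “opposite orientation” if $\det \varphi'(\boldsymbol{u}) < 0$ for every $\boldsymbol{u} \in A$.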




We provide some examples. 


Example 1 Given a curve $\sigma : [a,b] \to \mathbb{R}^N$, an equivalent curve with the opposite orientation is, for example, $\tilde\sigma : [a,b] \to \mathbb{R}^N$ defined by

$$\tilde\sigma(t) = \sigma(a + b - t)\,.$$


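Example 2 If $\sigma$ is regular, an interesting example of an equivalent curve with the same orientation is obtained by considering the function

$$\varphi(t) = \int_a^t \|\sigma'(\tau)\|\, d\tau\,.$$

Since $\varphi'(t) = \|\sigma'(t)\| > 0$ for every $t \in\, ]a,b[$, setting $\iota_1 = \varphi(b)$, we have that $\varphi : [a,b] \to [0, \iota_1]$ is bijective and the curve $\sigma_1 : [0, \iota_1] \to \mathbb{R}^N$, defined as $\sigma_1(s) = \sigma(\varphi^{-1}(s))$, is equivalent to $\sigma$. Notice that, for every $s \in\, ]0, \iota_1[$, we have that

$$\|\sigma_1'(s)\| = \big\|\sigma'(\varphi^{-1}(s))\, (\varphi^{-1})'(s)\big\| = \Big\|\sigma'(\varphi^{-1}(s))\, \frac{1}{\varphi'(\varphi^{-1}(s))}\Big\| = \Big\|\frac{\sigma'(\varphi^{-1}(s))}{\|\sigma'(\varphi^{-1}(s))\|}\Big\| = 1\,.$$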


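Example 3 Given a surface $\sigma : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3$, an equivalent surface with the opposite orientation is, for example, $\tilde\sigma : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3$ defined by

$$\tilde\sigma(u,v) = \sigma(u, a_2 + b_2 - v)$$

or by

$$\tilde\sigma(u,v) = \sigma(a_1 + b_1 - u, v)\,.$$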


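Theorem 12.4 Let $\sigma : I \to \mathbb{R}^N$ and $\tilde\sigma : J \to \mathbb{R}^N$ be two equivalent $M$-surfaces. If they have the same orientation, then

$$\int_\sigma \omega = \int_{\tilde\sigma} \omega\,;$$

if they have opposite orientations, then

$$\int_\sigma \omega = -\int_{\tilde\sigma} \omega\,.$$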


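Proof We have an $M$-differential form of the type

$$\omega(\boldsymbol{x}) = \sum_{1 \le i_1 < \dots < i_M \le N} f_{i_1,\dots,i_M}(\boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M}\,.$$

Let $\varphi : A \to B$ be as in the definition of equivalent $M$-surfaces, such that $\sigma = \tilde\sigma \circ \varphi$. Since $I \setminus A$ and $J \setminus B$ are negligible and $\sigma'_{(i_1,\dots,i_M)}(\boldsymbol{u}) = \tilde\sigma'_{(i_1,\dots,i_M)}(\varphi(\boldsymbol{u}))\, \varphi'(\boldsymbol{u})$ on $A$, by the Change of Variables Theorem 11.43, we have

$$\begin{aligned}
\int_\sigma \omega &= \sum_{1 \le i_1 < \dots < i_M \le N} \int_I f_{i_1,\dots,i_M}(\sigma(\boldsymbol{u})) \det \sigma'_{(i_1,\dots,i_M)}(\boldsymbol{u})\, d\boldsymbol{u} \\
&= \sum_{1 \le i_1 < \dots < i_M \le N} \int_A f_{i_1,\dots,i_M}(\sigma(\boldsymbol{u})) \det \sigma'_{(i_1,\dots,i_M)}(\boldsymbol{u})\, d\boldsymbol{u} \\
&= \sum_{1 \le i_1 < \dots < i_M \le N} \int_A f_{i_1,\dots,i_M}(\tilde\sigma(\varphi(\boldsymbol{u}))) \det \tilde\sigma'_{(i_1,\dots,i_M)}(\varphi(\boldsymbol{u})) \det \varphi'(\boldsymbol{u})\, d\boldsymbol{u} \\
&= \pm \sum_{1 \le i_1 < \dots < i_M \le N} \int_B f_{i_1,\dots,i_M}(\tilde\sigma(\boldsymbol{v})) \det \tilde\sigma'_{(i_1,\dots,i_M)}(\boldsymbol{v})\, d\boldsymbol{v} = \pm \int_{\tilde\sigma} \omega\,,
\end{aligned}$$

with positive sign if $\det \varphi' > 0$, negative if $\det \varphi' < 0$. □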


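Remark 12.5 In general, if $\sigma$ and $\tilde\sigma$ are equivalent, we do not necessarily have the equality $\big|\int_\sigma \omega\big| = \big|\int_{\tilde\sigma} \omega\big|$. Indeed, it is not guaranteed that they have the same or opposite orientations. For example, if we consider the two surfaces $\sigma, \tilde\sigma : [1,2] \times [0, 2\pi] \to \mathbb{R}^3$, defined by

$$\sigma(u,v) = \Big( \big(\tfrac{3}{2} + (u - \tfrac{3}{2}) \cos\tfrac{v}{2}\big) \cos v,\ \big(\tfrac{3}{2} + (u - \tfrac{3}{2}) \cos\tfrac{v}{2}\big) \sin v,\ (u - \tfrac{3}{2}) \sin\tfrac{v}{2} \Big)\,,$$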
$$\tilde\sigma(u,v) = \sigma(u, v + \pi)\,,$$

it is possible to see that they are both parametrizations of the same set (a Möbius strip), and therefore they are equivalent (the reader is invited to explicitly find a diffeomorphism $\varphi : A \to B$ with the properties required by the definition). On the other hand, if we consider the 2-differential form $\omega(x_1, x_2, x_3) = dx_1 \wedge dx_2$, determined by the constant vector field $(0, 0, 1)$, then computation yields

$$\int_\sigma \omega = 0\,, \qquad \int_{\tilde\sigma} \omega = -6\,.$$
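The two integrals of the remark can be reproduced numerically. The sketch below rests on two assumptions that should be read as such: the Möbius parametrization is the one with central radius $3/2$ on $[1,2] \times [0, 2\pi]$, and the second surface is taken to be $\tilde\sigma(u,v) = \sigma(u, v + \pi)$. All function names are illustrative. For $\omega = dx_1 \wedge dx_2$ the integrand is the determinant of $\partial(\sigma_1, \sigma_2)/\partial(u,v)$, approximated here by central differences.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def sigma(u, v):
    # Moebius strip, central radius 3/2, half-width 1/2
    rho = 1.5 + (u - 1.5) * math.cos(v / 2)
    return (rho * math.cos(v), rho * math.sin(v), (u - 1.5) * math.sin(v / 2))

def sigma_tilde(u, v):
    # assumed form of the second parametrization: a shift by pi in v
    return sigma(u, v + math.pi)

def det12(surface, u, v, h=1e-5):
    # determinant of d(x1, x2)/d(u, v), by central differences
    du = [(p - m) / (2 * h) for p, m in zip(surface(u + h, v), surface(u - h, v))]
    dv = [(p - m) / (2 * h) for p, m in zip(surface(u, v + h), surface(u, v - h))]
    return du[0] * dv[1] - dv[0] * du[1]

def integral_dx1_dx2(surface):
    # integral of dx1 ^ dx2 over the surface parametrized on [1, 2] x [0, 2 pi]
    return simpson(lambda v: simpson(lambda u: det12(surface, u, v), 1.0, 2.0),
                   0.0, 2 * math.pi)

I_sigma = integral_dx1_dx2(sigma)
I_tilde = integral_dx1_dx2(sigma_tilde)
print(I_sigma, I_tilde)
```

Under these assumptions the two values differ in absolute value, which is exactly the phenomenon described in the remark: the change of parameters preserves orientation on one half of the rectangle and reverses it on the other.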


We now consider the important case where M = N. 




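Theorem 12.6 Let $M = N$; if $\sigma$ is regular and injective on $\mathring{I}$ with $\det \sigma' > 0$, and $\omega$ is of the type

$$\omega(\boldsymbol{x}) = f(\boldsymbol{x})\, dx_1 \wedge \dots \wedge dx_N\,,$$

then

$$\int_\sigma \omega = \int_{\sigma(I)} f\,.$$

Proof By Corollary 10.25, we see that $\sigma$ induces a diffeomorphism between $\mathring{I}$ and $\sigma(\mathring{I})$. Since both the boundary of $I$ and its image through $\sigma$ are negligible (Lemma 11.38), by the Change of Variables Theorem 11.43, we have

$$\begin{aligned}
\int_\sigma \omega &= \int_I f(\sigma(\boldsymbol{u})) \det \sigma'(\boldsymbol{u})\, d\boldsymbol{u} \\
&= \int_{\mathring{I}} f(\sigma(\boldsymbol{u}))\, |\det \sigma'(\boldsymbol{u})|\, d\boldsymbol{u} \\
&= \int_{\sigma(\mathring{I})} f = \int_{\sigma(I)} f\,.
\end{aligned}$$

This completes the proof. □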


If $\sigma$ is the identity function, then $\sigma(I) = I$, and instead of $\int_\sigma \omega$ one usually writes $\int_I \omega$. Hence, we have that

$$\int_I f\, dx_1 \wedge \dots \wedge dx_N = \int_I f\,.$$


12.6 Pull-Back Transformation 


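Consider again an $M$-differential form $\omega \in \mathcal{F}_M(O, \mathbb{R}^N)$,

$$\omega(\boldsymbol{x}) = \sum_{1 \le i_1 < \dots < i_M \le N} f_{i_1,\dots,i_M}(\boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M}\,.$$

Let $\phi : V \to O$ be a function of class $C^1$, where $V$ is an open subset of some $\mathbb{R}^P$. We can then write

$$\phi(\boldsymbol{y}) = (\phi_1(\boldsymbol{y}), \dots, \phi_N(\boldsymbol{y}))\,,$$

with $\boldsymbol{y} = (y_1, \dots, y_P) \in V$, and define a new $M$-differential form $T_\phi \omega \in \mathcal{F}_M(V, \mathbb{R}^P)$ as

$$T_\phi \omega(\boldsymbol{y}) = \sum_{1 \le i_1 < \dots < i_M \le N} f_{i_1,\dots,i_M}(\phi(\boldsymbol{y}))\, d\phi_{i_1}(\boldsymbol{y}) \wedge \dots \wedge d\phi_{i_M}(\boldsymbol{y})\,;$$

it is called a “pull-back transformation” through $\phi$ of $\omega$. Notice that

$$\begin{aligned}
d\phi_{i_1}(\boldsymbol{y}) \wedge \dots \wedge d\phi_{i_M}(\boldsymbol{y}) &= \Big( \sum_{j=1}^P \frac{\partial \phi_{i_1}}{\partial y_j}(\boldsymbol{y})\, dy_j \Big) \wedge \dots \wedge \Big( \sum_{j=1}^P \frac{\partial \phi_{i_M}}{\partial y_j}(\boldsymbol{y})\, dy_j \Big) \\
&= \sum_{j_1, \dots, j_M = 1}^P \frac{\partial \phi_{i_1}}{\partial y_{j_1}}(\boldsymbol{y}) \cdots \frac{\partial \phi_{i_M}}{\partial y_{j_M}}(\boldsymbol{y})\, dy_{j_1} \wedge \dots \wedge dy_{j_M}\,.
\end{aligned}$$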


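The following three properties are readily verified.

Proposition 12.7 For any constant $c \in \mathbb{R}$, we have

$$T_\phi(c\,\omega) = c\, T_\phi \omega\,.$$

Proposition 12.8 If $\tilde\omega$ is an $\tilde M$-differential form defined on $O$, then

$$T_\phi(\omega \wedge \tilde\omega) = T_\phi \omega \wedge T_\phi \tilde\omega\,.$$

Proposition 12.9 If moreover, $\tilde M = M$, then

$$T_\phi(\omega + \tilde\omega) = T_\phi \omega + T_\phi \tilde\omega\,.$$

Let us now prove some additional properties.

Proposition 12.10 If $\psi : W \to V$ and $\phi : V \to O$, then

$$T_\psi(T_\phi \omega) = T_{\phi \circ \psi}\, \omega\,.$$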


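Proof By the preceding linearity properties, it will be sufficient to consider the case of a differential form of the type

$$\omega(\boldsymbol{x}) = f_{i_1,\dots,i_M}(\boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M}\,.$$

Then

$$T_\psi(T_\phi \omega) = (f_{i_1,\dots,i_M} \circ \phi \circ \psi) \sum_{j_1, \dots, j_M = 1}^P \Big( \Big( \frac{\partial \phi_{i_1}}{\partial y_{j_1}} \cdots \frac{\partial \phi_{i_M}}{\partial y_{j_M}} \Big) \circ \psi \Big)\, d\psi_{j_1} \wedge \dots \wedge d\psi_{j_M}\,.$$

On the other hand,

$$T_{\phi \circ \psi}\, \omega = \big(f_{i_1,\dots,i_M} \circ (\phi \circ \psi)\big)\, d(\phi \circ \psi)_{i_1} \wedge \dots \wedge d(\phi \circ \psi)_{i_M}\,,$$

and since

$$d(\phi \circ \psi)_{i_k} = d(\phi_{i_k} \circ \psi) = \sum_{j=1}^P \Big( \frac{\partial \phi_{i_k}}{\partial y_j} \circ \psi \Big)\, d\psi_j\,,$$

equality then holds. □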


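Proposition 12.11 Assume that $\phi$ is of class $C^2$. If $\omega$ is of class $C^1$, then $T_\phi \omega$ is too, and

$$d(T_\phi \omega) = T_\phi(d\omega)\,.$$

Proof Here, too, it is sufficient to consider the case $\omega = f_{i_1,\dots,i_M}\, dx_{i_1} \wedge \dots \wedge dx_{i_M}$. We have

$$\begin{aligned}
d(T_\phi \omega) &= d(f_{i_1,\dots,i_M} \circ \phi) \wedge d\phi_{i_1} \wedge \dots \wedge d\phi_{i_M} + (f_{i_1,\dots,i_M} \circ \phi)\, d(d\phi_{i_1} \wedge \dots \wedge d\phi_{i_M}) \\
&= d(f_{i_1,\dots,i_M} \circ \phi) \wedge d\phi_{i_1} \wedge \dots \wedge d\phi_{i_M} \\
&= \sum_{m=1}^N \Big( \frac{\partial f_{i_1,\dots,i_M}}{\partial x_m} \circ \phi \Big)\, d\phi_m \wedge d\phi_{i_1} \wedge \dots \wedge d\phi_{i_M}\,.
\end{aligned}$$

On the other hand,

$$d\omega(\boldsymbol{x}) = \sum_{m=1}^N \frac{\partial f_{i_1,\dots,i_M}}{\partial x_m}(\boldsymbol{x})\, dx_m \wedge dx_{i_1} \wedge \dots \wedge dx_{i_M}\,,$$

hence

$$T_\phi(d\omega) = \sum_{m=1}^N \Big( \frac{\partial f_{i_1,\dots,i_M}}{\partial x_m} \circ \phi \Big)\, d\phi_m \wedge d\phi_{i_1} \wedge \dots \wedge d\phi_{i_M}\,,$$

and the formula is thus proved. □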


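Proposition 12.12 If $\sigma : I \to \mathbb{R}^N$ is an $M$-surface whose image is contained in $O$, then

$$\int_\sigma \omega = \int_I T_\sigma \omega\,.$$

Proof As previously, we just consider the case $\omega = f_{i_1,\dots,i_M}\, dx_{i_1} \wedge \dots \wedge dx_{i_M}$. We have

$$\begin{aligned}
\int_I T_\sigma \omega &= \int_I f_{i_1,\dots,i_M}(\sigma(\boldsymbol{u})) \sum_{j_1, \dots, j_M = 1}^M \frac{\partial \sigma_{i_1}}{\partial u_{j_1}}(\boldsymbol{u}) \cdots \frac{\partial \sigma_{i_M}}{\partial u_{j_M}}(\boldsymbol{u})\, du_{j_1} \wedge \dots \wedge du_{j_M} \\
&= \int_I f_{i_1,\dots,i_M}(\sigma(\boldsymbol{u})) \det \sigma'_{(i_1,\dots,i_M)}(\boldsymbol{u})\, du_1 \wedge \dots \wedge du_M \\
&= \int_I f_{i_1,\dots,i_M}(\sigma(\boldsymbol{u})) \det \sigma'_{(i_1,\dots,i_M)}(\boldsymbol{u})\, d\boldsymbol{u} = \int_\sigma \omega\,.
\end{aligned}$$

This completes the proof. □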


12.7 Oriented Boundary of a Rectangle


Assume that $\sigma_1 : I_1 \to \mathbb{R}^N, \dots, \sigma_n : I_n \to \mathbb{R}^N$ are some $M$-surfaces. We call “gluing” of $\sigma_1, \dots, \sigma_n$ the $n$-tuple

$$(\sigma_1, \dots, \sigma_n)\,.$$

Notice that the elements inside the $n$-tuple need not be necessarily distinct: We could have that $\sigma_i = \sigma_j$ for some indices $i \ne j$.

We define the integral of an $M$-differential form $\omega$ on the aforementioned gluing by setting

$$\int_{(\sigma_1, \dots, \sigma_n)} \omega = \sum_{k=1}^n \int_{\sigma_k} \omega\,.$$

We will now use this to define the integral on the oriented boundary of a rectangle $I$ of $\mathbb{R}^{M+1}$, with $M \ge 1$, i.e.,

$$I = [a_1, b_1] \times \dots \times [a_{M+1}, b_{M+1}]\,.$$

We denote by $I_k$ the rectangle of $\mathbb{R}^M$ obtained from $I$ by the suppression of the $k$th component, i.e.,

$$I_k = [a_1, b_1] \times \dots \times [a_{k-1}, b_{k-1}] \times [a_{k+1}, b_{k+1}] \times \dots \times [a_{M+1}, b_{M+1}]\,.$$

Consider, for every $k$, the $M$-surfaces $\alpha_k^+, \beta_k^+ : I_k \to \mathbb{R}^{M+1}$ defined by

$$\alpha_k^+(u_1, \dots, \widehat{u_k}, \dots, u_{M+1}) = (u_1, \dots, u_{k-1}, a_k, u_{k+1}, \dots, u_{M+1})\,,$$

$$\beta_k^+(u_1, \dots, \widehat{u_k}, \dots, u_{M+1}) = (u_1, \dots, u_{k-1}, b_k, u_{k+1}, \dots, u_{M+1})\,,$$

where the meaning of the symbol $\widehat{\ }$ is to “suppress the underlying variable.” Consider, moreover, some $M$-surfaces $\alpha_k^-, \beta_k^- : I_k \to \mathbb{R}^{M+1}$ equivalent to $\alpha_k^+, \beta_k^+$, respectively, with opposite orientations.


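Definition 12.13 We call “oriented boundary” of the rectangle $I$ and denote by $\partial I$ a gluing of the following $M$-surfaces:

(i) $\alpha_k^-$ and $\beta_k^+$ if $k$ is odd.
(ii) $\alpha_k^+$ and $\beta_k^-$ if $k$ is even.

Hence, $\partial I$ is the $(2M+2)$-tuple given by

$$\partial I = (\alpha_1^-, \beta_1^+, \alpha_2^+, \beta_2^-, \dots, \alpha_{M+1}^+, \beta_{M+1}^-) \quad \text{if } M \text{ is odd},$$

$$\partial I = (\alpha_1^-, \beta_1^+, \alpha_2^+, \beta_2^-, \dots, \alpha_{M+1}^-, \beta_{M+1}^+) \quad \text{if } M \text{ is even}.$$

If $\omega$ is an $M$-differential form defined on a subset $O$ of $\mathbb{R}^{M+1}$ containing the image of $\partial I$, we will then have

$$\begin{aligned}
\int_{\partial I} \omega &= \sum_{k=1}^{M+1} (-1)^k \int_{\alpha_k^+} \omega + \sum_{k=1}^{M+1} (-1)^{k-1} \int_{\beta_k^+} \omega \\
&= \sum_{k=1}^{M+1} (-1)^{k-1} \Big( \int_{\beta_k^+} \omega - \int_{\alpha_k^+} \omega \Big)\,.
\end{aligned}$$

Let $M = 1$, and consider the rectangle $[a_1, b_1] \times [a_2, b_2]$. Then

$$\partial I = (\alpha_1^-, \beta_1^+, \alpha_2^+, \beta_2^-)\,,$$

where, for example,

$$\begin{aligned}
\alpha_1^- &: [a_2, b_2] \to \mathbb{R}^2\,, & v &\mapsto (a_1, a_2 + b_2 - v)\,, \\
\beta_1^+ &: [a_2, b_2] \to \mathbb{R}^2\,, & v &\mapsto (b_1, v)\,, \\
\alpha_2^+ &: [a_1, b_1] \to \mathbb{R}^2\,, & u &\mapsto (u, a_2)\,, \\
\beta_2^- &: [a_1, b_1] \to \mathbb{R}^2\,, & u &\mapsto (a_1 + b_1 - u, b_2)\,.
\end{aligned}$$

We can visualize geometrically $\partial I$ as the gluing of the sides of the rectangle $I$ oriented in such a way that the perimeter be described in counter-clockwise direction.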




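If $M = 2$, we have that

$$\partial I = (\alpha_1^-, \beta_1^+, \alpha_2^+, \beta_2^-, \alpha_3^-, \beta_3^+)\,,$$

where, for example,

$$\begin{aligned}
\alpha_1^- &: [a_2, b_2] \times [a_3, b_3] \to \mathbb{R}^3\,, & (v,w) &\mapsto (a_1, a_2 + b_2 - v, w)\,, \\
\beta_1^+ &: [a_2, b_2] \times [a_3, b_3] \to \mathbb{R}^3\,, & (v,w) &\mapsto (b_1, v, w)\,, \\
\alpha_2^+ &: [a_1, b_1] \times [a_3, b_3] \to \mathbb{R}^3\,, & (u,w) &\mapsto (u, a_2, w)\,, \\
\beta_2^- &: [a_1, b_1] \times [a_3, b_3] \to \mathbb{R}^3\,, & (u,w) &\mapsto (u, b_2, a_3 + b_3 - w)\,, \\
\alpha_3^- &: [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3\,, & (u,v) &\mapsto (a_1 + b_1 - u, v, a_3)\,, \\
\beta_3^+ &: [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3\,, & (u,v) &\mapsto (u, v, b_3)\,.
\end{aligned}$$

In this case, we can visualize $\partial I$ as the gluing of the six faces of the parallelepiped $I$, each oriented in such a way that the normal unit vector will always be directed toward the exterior.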




12.8 Gauss Formula 


In this section, $I$ will be a rectangle in $\mathbb{R}^N$, with $N \ge 2$. In the following theorem, the elegant Gauss formula is obtained.


Theorem 12.14 (Gauss Theorem) If $\omega$ is an $(N-1)$-differential form of class $C^1$ defined on an open set containing the rectangle $I$ in $\mathbb{R}^N$, then

$$\int_I d\omega = \int_{\partial I} \omega\,.$$


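Proof We can write $\omega$ as

$$\omega(\boldsymbol{x}) = \sum_{j=1}^N F_j(\boldsymbol{x})\, dx_1 \wedge \dots \wedge \widehat{dx_j} \wedge \dots \wedge dx_N\,.$$

Then,

$$\begin{aligned}
d\omega(\boldsymbol{x}) &= \sum_{j=1}^N \sum_{m=1}^N \frac{\partial F_j}{\partial x_m}(\boldsymbol{x})\, dx_m \wedge dx_1 \wedge \dots \wedge \widehat{dx_j} \wedge \dots \wedge dx_N \\
&= \sum_{j=1}^N (-1)^{j-1} \frac{\partial F_j}{\partial x_j}(\boldsymbol{x})\, dx_1 \wedge \dots \wedge dx_N\,.
\end{aligned}$$

Since the partial derivatives of each $F_j$ are continuous, they are integrable on the rectangle $I$, and we can use Fubini's Reduction Theorem 11.34 to obtain

$$\begin{aligned}
\int_I d\omega &= \sum_{j=1}^N (-1)^{j-1} \int_I \frac{\partial F_j}{\partial x_j}(\boldsymbol{x})\, dx_1 \dots dx_N \\
&= \sum_{j=1}^N (-1)^{j-1} \int_{I_j} \Big( \int_{a_j}^{b_j} \frac{\partial F_j}{\partial x_j}(x_1, \dots, x_N)\, dx_j \Big)\, dx_1 \dots \widehat{dx_j} \dots dx_N \\
&= \sum_{j=1}^N (-1)^{j-1} \int_{I_j} \big[ F_j(x_1, \dots, x_{j-1}, b_j, x_{j+1}, \dots, x_N) - F_j(x_1, \dots, x_{j-1}, a_j, x_{j+1}, \dots, x_N) \big]\, dx_1 \dots \widehat{dx_j} \dots dx_N\,,
\end{aligned}$$

by the Fundamental Theorem. On the other hand, we have

$$\begin{aligned}
\int_{\alpha_k^+} \omega &= \sum_{j=1}^N \int_{\alpha_k^+} F_j\, dx_1 \wedge \dots \wedge \widehat{dx_j} \wedge \dots \wedge dx_N \\
&= \int_{I_k} F_k(x_1, \dots, x_{k-1}, a_k, x_{k+1}, \dots, x_N)\, dx_1 \dots \widehat{dx_k} \dots dx_N\,,
\end{aligned}$$

since

$$\det (\alpha_k^+)'_{(1, \dots, \widehat{j}, \dots, N)} = \begin{cases} 0 & \text{if } j \ne k\,, \\ 1 & \text{if } j = k\,. \end{cases}$$

Similarly,

$$\int_{\beta_k^+} \omega = \int_{I_k} F_k(x_1, \dots, x_{k-1}, b_k, x_{k+1}, \dots, x_N)\, dx_1 \dots \widehat{dx_k} \dots dx_N\,,$$

so that

$$\begin{aligned}
\int_{\partial I} \omega &= \sum_{k=1}^N (-1)^{k-1} \Big( \int_{\beta_k^+} \omega - \int_{\alpha_k^+} \omega \Big) \\
&= \sum_{k=1}^N (-1)^{k-1} \int_{I_k} \big[ F_k(x_1, \dots, x_{k-1}, b_k, x_{k+1}, \dots, x_N) - F_k(x_1, \dots, x_{k-1}, a_k, x_{k+1}, \dots, x_N) \big]\, dx_1 \dots \widehat{dx_k} \dots dx_N\,,
\end{aligned}$$

and the proof is completed. □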


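Let us now analyze the particular cases $N = 2$ and $N = 3$.
If $N = 2$, then we have a 1-differential form

$$\omega(\boldsymbol{x}) = f_1(\boldsymbol{x})\, dx_1 + f_2(\boldsymbol{x})\, dx_2\,,$$

with the associated vector field $F(\boldsymbol{x}) = (f_1(\boldsymbol{x}), f_2(\boldsymbol{x}))$. The Gauss formula reads as follows:

$$\int_I \Big( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \Big)\, dx_1 \wedge dx_2 = \int_{\partial I} f_1\, dx_1 + f_2\, dx_2\,,$$

i.e., equivalently,

$$\int_I \Big( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \Big) = \int_{\partial I} F \cdot d\ell\,.$$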
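To see the planar Gauss formula at work, one can compare both sides numerically on a concrete rectangle. The sketch below is an illustration only: the rectangle $[0,1] \times [0,2]$ and the field $F = (x^2 y,\ x + y^3)$ are hypothetical sample data, and the four sides are parametrized as the curves $\alpha_1^-, \beta_1^+, \alpha_2^+, \beta_2^-$ of the oriented boundary of a rectangle in $\mathbb{R}^2$.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

# hypothetical sample data: rectangle I = [a1, b1] x [a2, b2] and field F = (f1, f2)
a1, b1, a2, b2 = 0.0, 1.0, 0.0, 2.0

def f1(x, y):
    return x * x * y

def f2(x, y):
    return x + y ** 3

def rot(x, y):
    # d f2/dx1 - d f1/dx2, computed by hand for the sample field
    return 1.0 - x * x

# the four sides of the oriented boundary: (curve, derivative, parameter interval)
sides = [
    (lambda v: (a1, a2 + b2 - v), lambda v: (0.0, -1.0), (a2, b2)),  # alpha_1^-
    (lambda v: (b1, v),           lambda v: (0.0, 1.0),  (a2, b2)),  # beta_1^+
    (lambda u: (u, a2),           lambda u: (1.0, 0.0),  (a1, b1)),  # alpha_2^+
    (lambda u: (a1 + b1 - u, b2), lambda u: (-1.0, 0.0), (a1, b1)),  # beta_2^-
]

def line_integral(curve, deriv, interval):
    def g(t):
        x, y = curve(t)
        dx, dy = deriv(t)
        return f1(x, y) * dx + f2(x, y) * dy
    return simpson(g, *interval)

boundary = sum(line_integral(*side) for side in sides)
interior = simpson(lambda y: simpson(lambda x: rot(x, y), a1, b1), a2, b2)
print(boundary, interior)
```

For this sample field both sides equal $4/3$; reversing the direction of any one side changes the boundary sum, which shows why the orientation of each side matters.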
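If $N = 3$, then we have a 2-differential form

$$\omega(\boldsymbol{x}) = f_1(\boldsymbol{x})\, dx_2 \wedge dx_3 + f_2(\boldsymbol{x})\, dx_3 \wedge dx_1 + f_3(\boldsymbol{x})\, dx_1 \wedge dx_2\,,$$

with the associated vector field $F(\boldsymbol{x}) = (f_1(\boldsymbol{x}), f_2(\boldsymbol{x}), f_3(\boldsymbol{x}))$. The Gauss formula reads as follows:

$$\int_I \Big( \frac{\partial f_1}{\partial x_1} + \frac{\partial f_2}{\partial x_2} + \frac{\partial f_3}{\partial x_3} \Big)\, dx_1 \wedge dx_2 \wedge dx_3 = \int_{\partial I} f_1\, dx_2 \wedge dx_3 + f_2\, dx_3 \wedge dx_1 + f_3\, dx_1 \wedge dx_2\,,$$

which is equivalent to

$$\int_I \nabla \cdot F = \int_{\partial I} F \cdot dS\,.$$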


12.9 Oriented Boundary of an M-Surface 
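In this section, $I$ will be a rectangle in $\mathbb{R}^{M+1}$ and $\sigma : I \to \mathbb{R}^N$ an $(M+1)$-surface.


Definition 12.15 For $1 \le M \le N - 1$, we call “oriented boundary” of $\sigma$ and denote by $\partial\sigma$ a gluing of the following $M$-surfaces:

(i) $\sigma \circ \alpha_k^-$ and $\sigma \circ \beta_k^+$ if $k$ is odd.
(ii) $\sigma \circ \alpha_k^+$ and $\sigma \circ \beta_k^-$ if $k$ is even.

Hence, $\partial\sigma$ is the $(2M+2)$-tuple given by

$$\partial\sigma = (\sigma \circ \alpha_1^-, \sigma \circ \beta_1^+, \sigma \circ \alpha_2^+, \sigma \circ \beta_2^-, \dots, \sigma \circ \alpha_{M+1}^+, \sigma \circ \beta_{M+1}^-) \quad \text{if } M \text{ is odd},$$

$$\partial\sigma = (\sigma \circ \alpha_1^-, \sigma \circ \beta_1^+, \sigma \circ \alpha_2^+, \sigma \circ \beta_2^-, \dots, \sigma \circ \alpha_{M+1}^-, \sigma \circ \beta_{M+1}^+) \quad \text{if } M \text{ is even}.$$

Given an $M$-differential form $\omega$ whose domain contains the image of $\partial\sigma$, we will then have

$$\begin{aligned}
\int_{\partial\sigma} \omega &= \sum_{k=1}^{M+1} (-1)^k \int_{\sigma \circ \alpha_k^+} \omega + \sum_{k=1}^{M+1} (-1)^{k-1} \int_{\sigma \circ \beta_k^+} \omega \\
&= \sum_{k=1}^{M+1} (-1)^{k-1} \Big( \int_{\sigma \circ \beta_k^+} \omega - \int_{\sigma \circ \alpha_k^+} \omega \Big)\,.
\end{aligned}$$

Remark 12.16 It is useful to extend the meaning of $\int_{\partial\sigma} \omega$ to the case where $\sigma : [a,b] \to \mathbb{R}^N$ is a curve, with $N \ge 1$, and $\omega = f : O \to \mathbb{R}$ is a 0-differential form; in this case, we set

$$\int_{\partial\sigma} \omega = f(\sigma(b)) - f(\sigma(a))\,.$$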


Examples As an illustration, consider as usual the case $N = 3$. We begin with three examples of oriented boundaries of surfaces, where

$$\partial\sigma = (\sigma \circ \alpha_1^-, \sigma \circ \beta_1^+, \sigma \circ \alpha_2^+, \sigma \circ \beta_2^-)\,.$$

1. Let $\sigma : [r, R] \times [0, 2\pi] \to \mathbb{R}^3$, with $0 \le r < R$, be given by

$$\sigma(u,v) = (u \cos v, u \sin v, 0)\,.$$

Its image is a disk if $r = 0$, an annulus if $r > 0$. The oriented boundary $\partial\sigma$ is given by a gluing of the following four curves:

$$\begin{aligned}
\sigma \circ \alpha_1^-(v) &= (r \cos v, -r \sin v, 0)\,, \\
\sigma \circ \beta_1^+(v) &= (R \cos v, R \sin v, 0)\,, \\
\sigma \circ \alpha_2^+(u) &= (u, 0, 0)\,, \\
\sigma \circ \beta_2^-(u) &= (r + R - u, 0, 0)\,.
\end{aligned}$$

The first curve has as its image a circle with radius $r$, which degenerates into the origin in the case where $r = 0$. The second has as its image a circle with radius $R$. Notice, however, that these two circles are described by the two curves in opposite directions. The last two curves are equivalent to each other, with opposite orientations.

Consider, for example, the vector field $F(x, y, z) = (-y, x, xye^z)$. Then

$$\begin{aligned}
\int_{\partial\sigma} F \cdot d\ell &= \int_{\sigma \circ \alpha_1^-} F \cdot d\ell + \int_{\sigma \circ \beta_1^+} F \cdot d\ell \\
&= \int_0^{2\pi} [-r^2 \sin^2 v - r^2 \cos^2 v]\, dv + \int_0^{2\pi} [R^2 \sin^2 v + R^2 \cos^2 v]\, dv \\
&= 2\pi (R^2 - r^2)\,.
\end{aligned}$$


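2. Consider the surface $\sigma : [r, R] \times [0, 2\pi] \to \mathbb{R}^3$, with $0 < r < R$, defined by

$$\begin{aligned}
\sigma(u,v) = \Big( & \Big( \frac{r+R}{2} + \Big( u - \frac{r+R}{2} \Big) \cos\Big(\frac{v}{2}\Big) \Big) \cos v\,, \\
& \Big( \frac{r+R}{2} + \Big( u - \frac{r+R}{2} \Big) \cos\Big(\frac{v}{2}\Big) \Big) \sin v\,, \ \Big( u - \frac{r+R}{2} \Big) \sin\Big(\frac{v}{2}\Big) \Big)\,,
\end{aligned}$$

whose image is a Möbius strip. In this case, the oriented boundary is given by a gluing of

$$\begin{aligned}
\sigma \circ \alpha_1^-(v) = \Big( & \Big( \frac{r+R}{2} + \frac{R-r}{2} \cos\Big(\frac{v}{2}\Big) \Big) \cos v\,, \\
& -\Big( \frac{r+R}{2} + \frac{R-r}{2} \cos\Big(\frac{v}{2}\Big) \Big) \sin v\,, \ -\frac{R-r}{2} \sin\Big(\frac{v}{2}\Big) \Big)\,, \\
\sigma \circ \beta_1^+(v) = \Big( & \Big( \frac{r+R}{2} + \frac{R-r}{2} \cos\Big(\frac{v}{2}\Big) \Big) \cos v\,, \\
& \Big( \frac{r+R}{2} + \frac{R-r}{2} \cos\Big(\frac{v}{2}\Big) \Big) \sin v\,, \ \frac{R-r}{2} \sin\Big(\frac{v}{2}\Big) \Big)\,, \\
\sigma \circ \alpha_2^+(u) = (u&, 0, 0)\,, \\
\sigma \circ \beta_2^-(u) = (u&, 0, 0)\,.
\end{aligned}$$

Notice that in this case the last two curves are exactly the same.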


3. Consider the surface $\sigma : [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$, defined by

$$\sigma(\phi, \theta) = (R \sin\phi \cos\theta, R \sin\phi \sin\theta, R \cos\phi)\,,$$

whose image is a sphere with radius $R > 0$ centered at the origin. In this case, the oriented boundary is given by a gluing of

$$\begin{aligned}
\sigma \circ \alpha_1^-(\theta) &= (0, 0, R)\,, \\
\sigma \circ \beta_1^+(\theta) &= (0, 0, -R)\,, \\
\sigma \circ \alpha_2^+(\phi) &= (R \sin\phi, 0, R \cos\phi)\,, \\
\sigma \circ \beta_2^-(\phi) &= (R \sin\phi, 0, -R \cos\phi)\,.
\end{aligned}$$

Notice that the first two curves are degenerated into one point, while the other two are equivalent to each other, with opposite orientations. Hence, for any choice of a vector field $F$, we will have $\int_{\partial\sigma} F \cdot d\ell = 0$.




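Let us see now an example of an oriented boundary of a volume in $\mathbb{R}^3$. Let $\sigma : [0, R] \times [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$ be the volume defined by

$$\sigma(\rho, \phi, \theta) = (\rho \sin\phi \cos\theta, \rho \sin\phi \sin\theta, \rho \cos\phi)\,,$$

whose image is the closed ball, centered at the origin, with radius $R > 0$. The oriented boundary $\partial\sigma$ is a gluing of six surfaces,

$$\partial\sigma = (\sigma \circ \alpha_1^-, \sigma \circ \beta_1^+, \sigma \circ \alpha_2^+, \sigma \circ \beta_2^-, \sigma \circ \alpha_3^-, \sigma \circ \beta_3^+)\,,$$

where, e.g.,

$$\begin{aligned}
\sigma \circ \alpha_1^-(\phi, \theta) &= (0, 0, 0)\,, \\
\sigma \circ \beta_1^+(\phi, \theta) &= (R \sin\phi \cos\theta, R \sin\phi \sin\theta, R \cos\phi)\,, \\
\sigma \circ \alpha_2^+(\rho, \theta) &= (0, 0, \rho)\,, \\
\sigma \circ \beta_2^-(\rho, \theta) &= (0, 0, -\rho)\,, \\
\sigma \circ \alpha_3^-(\rho, \phi) &= ((R - \rho) \sin\phi, 0, (R - \rho) \cos\phi)\,, \\
\sigma \circ \beta_3^+(\rho, \phi) &= (\rho \sin\phi, 0, \rho \cos\phi)\,.
\end{aligned}$$

Notice that the first surface is degenerated into a point (the origin), the second has as its image the entire sphere, the third and fourth are degenerated in two lines, while the remaining two are equivalent to each other, with opposite orientations. Hence, given a vector field $F$, we will have

$$\int_{\partial\sigma} F \cdot dS = \int_{\sigma \circ \beta_1^+} F \cdot dS\,.$$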


12.10 Stokes—Cartan Formula 
Let us first state the following theorem. 


Theorem 12.17 Let $f : O \to \mathbb{R}$ be a scalar function of class $C^1$ and $\sigma : [a,b] \to \mathbb{R}^N$ a curve whose image is contained in $O$. Then

$$\int_\sigma \nabla f \cdot d\ell = f(\sigma(b)) - f(\sigma(a))\,.$$

Proof Consider the function $G : [a,b] \to \mathbb{R}$ defined by $G(t) = f(\sigma(t))$. It is of class $C^1$, and by the Fundamental Theorem we have

$$\int_a^b G'(t)\, dt = G(b) - G(a)\,.$$

Since $G'(t) = \nabla f(\sigma(t)) \cdot \sigma'(t)$, the conclusion follows. □


Remark 12.18 The line integral of the gradient of a function $f$ does not depend on the chosen curve itself but only on the values of the function at the two endpoints $\sigma(b)$ and $\sigma(a)$.


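Example Given

$$F(x, y, z) = -\Big( \frac{x}{[x^2 + y^2 + z^2]^{3/2}}\,, \frac{y}{[x^2 + y^2 + z^2]^{3/2}}\,, \frac{z}{[x^2 + y^2 + z^2]^{3/2}} \Big)$$

and the curve $\sigma : [0, 4\pi] \to \mathbb{R}^3$ defined by $\sigma(t) = (\cos t, \sin t, t)$, we want to compute the line integral $\int_\sigma F \cdot d\ell$. Observe that $F = \nabla f$, with

$$f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}}\,.$$

Hence,

$$\int_\sigma F \cdot d\ell = f(\sigma(4\pi)) - f(\sigma(0)) = \frac{1}{\sqrt{1 + 16\pi^2}} - 1\,.$$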
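The shortcut through the potential can be compared with a direct numerical evaluation of the line integral. This is only a sketch under the assumption that the field is $F = \nabla f$ with $f = 1/\sqrt{x^2+y^2+z^2}$, as in the example; the helper names are introduced here for illustration.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def f(x, y, z):
    # the potential of the example
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def F(x, y, z):
    # gradient of f, written out by hand
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)

def sigma(t):
    return (math.cos(t), math.sin(t), t)

def sigma_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

def integrand(t):
    # scalar product F(sigma(t)) . sigma'(t)
    return sum(c * d for c, d in zip(F(*sigma(t)), sigma_prime(t)))

direct = simpson(integrand, 0.0, 4 * math.pi)
via_potential = f(*sigma(4 * math.pi)) - f(*sigma(0.0))
print(direct, via_potential)
```

The two numbers agree up to the quadrature error, while the second one needs only two evaluations of $f$.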


Let us now state the following generalization of the Gauss theorem, where the 
important Stokes—Cartan formula is obtained. 


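Theorem 12.19 (Stokes–Cartan Theorem I) Let $1 \le M \le N - 1$. If $\omega$ is an $M$-differential form of class $C^1$ defined on an open set $O \subseteq \mathbb{R}^N$, and $\sigma : I \to \mathbb{R}^N$ is an $(M+1)$-surface whose image is contained in $O$, then

$$\int_\sigma d\omega = \int_{\partial\sigma} \omega\,.$$

Proof Since

$$\int_{\sigma \circ \alpha_k^+} \omega = \int_{I_k} T_{\sigma \circ \alpha_k^+}\, \omega = \int_{I_k} T_{\alpha_k^+}(T_\sigma \omega) = \int_{\alpha_k^+} T_\sigma \omega$$

and

$$\int_{\sigma \circ \beta_k^+} \omega = \int_{I_k} T_{\sigma \circ \beta_k^+}\, \omega = \int_{I_k} T_{\beta_k^+}(T_\sigma \omega) = \int_{\beta_k^+} T_\sigma \omega\,,$$

we have

$$\begin{aligned}
\int_{\partial\sigma} \omega &= \sum_{k=1}^{M+1} (-1)^{k-1} \Big( \int_{\sigma \circ \beta_k^+} \omega - \int_{\sigma \circ \alpha_k^+} \omega \Big) \\
&= \sum_{k=1}^{M+1} (-1)^{k-1} \Big( \int_{\beta_k^+} T_\sigma \omega - \int_{\alpha_k^+} T_\sigma \omega \Big) = \int_{\partial I} T_\sigma \omega\,.
\end{aligned}$$

If $\sigma$ is of class $C^2$, then $T_\sigma \omega$ is of class $C^1$, and, applying the Gauss formula to $T_\sigma \omega$, we have

$$\int_{\partial I} T_\sigma \omega = \int_I d(T_\sigma \omega)\,.$$

But

$$\int_I d(T_\sigma \omega) = \int_I T_\sigma(d\omega) = \int_\sigma d\omega\,.$$

Hence, we have seen that

$$\int_{\partial\sigma} \omega = \int_{\partial I} T_\sigma \omega = \int_I d(T_\sigma \omega) = \int_\sigma d\omega\,,$$

and the theorem is proved in this case.
If $\sigma$ is not of class $C^2$, let $(\sigma_n)_n$ be a sequence of functions as provided by Proposition 11.61. Since they are of class $C^2$, we know that

$$\int_{\sigma_n} d\omega = \int_{\partial\sigma_n} \omega \quad \text{for every } n\,.$$

Using Theorem 7.24 we see that

$$\int_\sigma d\omega = \lim_n \int_{\sigma_n} d\omega\,, \qquad \int_{\partial\sigma} \omega = \lim_n \int_{\partial\sigma_n} \omega\,,$$

and the proof is thus completed. □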


We now concentrate on some corollaries. 




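The case $M = 1$, $N = 2$. We consider a 1-differential form

$$\omega(\boldsymbol{x}) = F_1(\boldsymbol{x})\, dx_1 + F_2(\boldsymbol{x})\, dx_2\,,$$

and we obtain the Gauss–Green formula.


Theorem 12.20 (Gauss–Green Theorem) Let $F : O \to \mathbb{R}^2$ be a $C^1$-vector field and $\sigma : I = [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^2$ be a surface whose image is contained in $O$. Then

$$\int_\sigma \Big( \frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2} \Big)\, dx_1 \wedge dx_2 = \int_{\partial\sigma} F \cdot d\ell\,.$$

Hence, if $\sigma$ is regular and injective on $\mathring{I}$, with $\det \sigma' > 0$, then

$$\int_{\sigma(I)} \Big( \frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2} \Big) = \int_{\partial\sigma} F \cdot d\ell\,.$$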


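Example Consider the surface $\sigma : I = [0,1] \times [0, 2\pi] \to \mathbb{R}^2$ defined by $\sigma(\rho, \theta) = (A\rho \cos\theta, B\rho \sin\theta)$, whose image is an elliptical surface with semiaxes having lengths $A > 0$ and $B > 0$. Take the vector field $F(x, y) = (-y, x)$. Since

$$\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = 2$$

and (as for the disk)

$$\int_{\partial\sigma} F \cdot d\ell = \int_{\sigma \circ \beta_1^+} F \cdot d\ell\,,$$

the Gauss–Green formula gives us

$$\int_{\sigma(I)} 2\, dx\, dy = \int_0^{2\pi} (-B \sin\theta, A \cos\theta) \cdot (-A \sin\theta, B \cos\theta)\, d\theta = 2\pi AB\,.$$

We then find the area of the elliptic surface: $\mu(\sigma(I)) = \pi AB$.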
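The same boundary integral gives a practical way of computing areas. As a hedged illustration, the snippet below evaluates $\tfrac{1}{2} \oint (-y\, dx + x\, dy)$ along the ellipse for the hypothetical semiaxes $A = 3$, $B = 2$ and compares the result with $\pi AB$; the function name `boundary_integral` is ours.

```python
import math

A, B = 3.0, 2.0  # hypothetical semiaxes, chosen only for the illustration

def boundary_integral(n=1000):
    """Line integral of F = (-y, x) along theta -> (A cos(theta), B sin(theta)),
    approximated by the rectangle rule on n equally spaced parameter values."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = k * h
        x, y = A * math.cos(theta), B * math.sin(theta)
        dx, dy = -A * math.sin(theta), B * math.cos(theta)
        total += (-y * dx + x * dy) * h
    return total

area = boundary_integral() / 2
print(area, math.pi * A * B)
```

Here the integrand is the constant $AB$, so the rectangle rule is exact up to rounding.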
The case $M = 1$, $N = 3$. We consider a 1-differential form

$$\omega(\boldsymbol{x}) = F_1(\boldsymbol{x})\, dx_1 + F_2(\boldsymbol{x})\, dx_2 + F_3(\boldsymbol{x})\, dx_3\,,$$

and we obtain the Kelvin–Stokes formula.


Theorem 12.21 (Kelvin–Stokes Theorem) Let $F : O \to \mathbb{R}^3$ be a $C^1$-vector field and $\sigma : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^3$ a surface whose image is contained in $O$. Then

$$\int_\sigma \nabla \times F \cdot dS = \int_{\partial\sigma} F \cdot d\ell\,.$$

Verbally The flux of the curl of the vector field $F$ through the surface $\sigma$ is equal to the line integral of $F$ along the oriented boundary of $\sigma$.


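Example Let $F(x, y, z) = (-y, x, 0)$ and $\gamma : [0, 2\pi] \to \mathbb{R}^3$ be the curve defined by $\gamma(t) = (R \cos t, R \sin t, 0)$; we want to compute the line integral $\int_\gamma F \cdot d\ell$. We have already seen how to compute this integral by the direct use of the definition.

We now proceed in a different way, as follows. Consider the surface $\sigma : [0, R] \times [0, 2\pi] \to \mathbb{R}^3$ given by $\sigma(\rho, \theta) = (\rho \cos\theta, \rho \sin\theta, 0)$. Observe that $\gamma = \sigma \circ \beta_1^+$, so

$$\int_\gamma F \cdot d\ell = \int_{\sigma \circ \beta_1^+} F \cdot d\ell = \int_{\partial\sigma} F \cdot d\ell = \int_\sigma \nabla \times F \cdot dS\,.$$

Since $\nabla \times F(x, y, z) = (0, 0, 2)$ and

$$\frac{\partial \sigma}{\partial \rho}(\rho, \theta) \times \frac{\partial \sigma}{\partial \theta}(\rho, \theta) = (0, 0, \rho)\,,$$

we then have

$$\int_\gamma F \cdot d\ell = \int_0^R \int_0^{2\pi} (0, 0, 2) \cdot (0, 0, \rho)\, d\theta\, d\rho = 2\pi R^2\,.$$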


The case $M = 2$, $N = 3$. We consider a 2-differential form

$$\omega(\boldsymbol{x}) = F_1(\boldsymbol{x})\, dx_2 \wedge dx_3 + F_2(\boldsymbol{x})\, dx_3 \wedge dx_1 + F_3(\boldsymbol{x})\, dx_1 \wedge dx_2\,,$$

and we obtain the Gauss–Ostrogradski formula.

Theorem 12.22 (Gauss–Ostrogradski Theorem) Let $F : O \to \mathbb{R}^3$ be a $C^1$-vector field and $\sigma : I = [a_1, b_1] \times [a_2, b_2] \times [a_3, b_3] \to \mathbb{R}^3$ a volume whose image is contained in $O$. Then

$$\int_\sigma \nabla \cdot F\, dx_1 \wedge dx_2 \wedge dx_3 = \int_{\partial\sigma} F \cdot dS\,.$$

Hence, if $\sigma$ is regular and injective on $\mathring{I}$, with $\det \sigma' > 0$, then

$$\int_{\sigma(I)} \nabla \cdot F = \int_{\partial\sigma} F \cdot dS\,.$$

In Intuitive Terms The integral of the divergence of the vector field $F$ on the set $V = \sigma(I)$ is equal to the flux of $F$ which exits from $V$.


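Example We want to compute the flux of the vector field

$$F(x, y, z) = \big( [x^2 + y^2 + z^2]\, x,\ [x^2 + y^2 + z^2]\, y,\ [x^2 + y^2 + z^2]\, z \big)$$

through a spherical surface parametrized by $\eta : [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$, defined as

$$\eta(\phi, \theta) = (R \sin\phi \cos\theta, R \sin\phi \sin\theta, R \cos\phi)\,.$$

Recall that $\eta = \sigma \circ \beta_1^+$, where $\sigma : I = [0, R] \times [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$ is the volume given by

$$\sigma(\rho, \phi, \theta) = (\rho \sin\phi \cos\theta, \rho \sin\phi \sin\theta, \rho \cos\phi)\,.$$

Hence,

$$\int_\eta F \cdot dS = \int_{\sigma \circ \beta_1^+} F \cdot dS = \int_{\partial\sigma} F \cdot dS = \int_{\sigma(I)} \nabla \cdot F\,.$$

Since $\nabla \cdot F(x, y, z) = 5(x^2 + y^2 + z^2)$, passing to spherical coordinates we have

$$\int_{\sigma(I)} \nabla \cdot F = \int_0^{2\pi} \int_0^{\pi} \int_0^R (5\rho^2)(\rho^2 \sin\phi)\, d\rho\, d\phi\, d\theta = 4\pi R^5\,.$$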
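For comparison, the flux through the sphere can also be approximated directly from the surface integral, without using the divergence. The sketch below assumes the hypothetical radius $R = 1.5$; the names `eta`, `normal` and `simpson` are illustrative, and the normal vector $\partial\eta/\partial\phi \times \partial\eta/\partial\theta$ is written out by hand.

```python
import math

R = 1.5  # hypothetical radius, chosen only for the illustration

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def F(x, y, z):
    r2 = x * x + y * y + z * z
    return (r2 * x, r2 * y, r2 * z)

def eta(phi, theta):
    s = math.sin(phi)
    return (R * s * math.cos(theta), R * s * math.sin(theta), R * math.cos(phi))

def normal(phi, theta):
    # cross product of d(eta)/d(phi) and d(eta)/d(theta)
    s, c = math.sin(phi), math.cos(phi)
    e_phi = (R * c * math.cos(theta), R * c * math.sin(theta), -R * s)
    e_theta = (-R * s * math.sin(theta), R * s * math.cos(theta), 0.0)
    return (e_phi[1] * e_theta[2] - e_phi[2] * e_theta[1],
            e_phi[2] * e_theta[0] - e_phi[0] * e_theta[2],
            e_phi[0] * e_theta[1] - e_phi[1] * e_theta[0])

def integrand(phi, theta):
    # scalar product F(eta) . (eta_phi x eta_theta)
    return sum(f * n for f, n in zip(F(*eta(phi, theta)), normal(phi, theta)))

flux = simpson(lambda theta: simpson(lambda phi: integrand(phi, theta), 0.0, math.pi),
               0.0, 2 * math.pi)
print(flux, 4 * math.pi * R ** 5)
```

Both routes lead to the same number; the volume integral of the divergence is simply the shorter computation by hand.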


12.11 Physical Interpretation of Curl and Divergence 


The Kelvin–Stokes and the Gauss–Ostrogradski formulas permit us to interpret the physical meaning of the curl and the divergence of a vector field $F$ in $\mathbb{R}^3$. We assume $F$ to be defined on some open set $O \subseteq \mathbb{R}^3$ and to be continuously differentiable.


Curl Assume at first that $\nabla \times F$ is constant. If we know its direction, we can take a plane orthogonal to it and on this plane a disk with radius $r > 0$ (and hence area $\pi r^2$), which we can easily parametrize in polar coordinates, thereby obtaining a 2-surface $\sigma_r : I \to \mathbb{R}^3$. Moreover, we can choose this parametrization in such a way that the normal unit vector $\nu_{\sigma_r}$ has the same direction as $\nabla \times F$. Then

$$\int_{\sigma_r} \nabla \times F \cdot dS = \pi r^2\, \|\nabla \times F\|\,,$$

and, by the Kelvin–Stokes formula,

$$\|\nabla \times F\| = \frac{1}{\pi r^2} \int_{\partial\sigma_r} F \cdot d\ell\,,$$

providing us the length of the curl of $F$. We thus see that the length of the curl measures in some way the rotational contribution of the vector field $F$ along the circle $\partial\sigma_r$. This explains the name “curl.”

When $\nabla \times F$ is not constant, let us fix a point $\boldsymbol{x}$ in the domain $O$ and proceed much as we did earlier, taking a disk centered at $\boldsymbol{x}$ with radius $r$ and parametrized by $\sigma_r$ such that the normal unit vector $\nu_{\sigma_r}$ has the same direction as $\nabla \times F(\boldsymbol{x})$. By continuity, if $r > 0$ is “small,” we can think of $\nabla \times F$ as being “almost constant” on this disk. More precisely, we will have that

$$\|\nabla \times F(\boldsymbol{x})\| = \lim_{r \to 0^+} \frac{1}{\pi r^2} \int_{\partial\sigma_r} F \cdot d\ell\,.$$

We thus interpret physically the intensity of the curl at a point $\boldsymbol{x}$ as a good measure of the rotational motion of the vector field along a circular trajectory centered at $\boldsymbol{x}$ with a very small radius $r$.

If we do not know the direction of $\nabla \times F(\boldsymbol{x})$ a priori, it can be determined as follows. Assume $\nabla \times F(\boldsymbol{x}) \ne 0$. For each plane passing through $\boldsymbol{x}$ we can take a disk centered at $\boldsymbol{x}$ with radius $r$ on this plane and parametrize it by some $\sigma_r$, in such a way that the normal unit vector $\nu_{\sigma_r}$ varies continuously with respect to $r > 0$. We compute $\lim_{r \to 0^+} \frac{1}{\pi r^2} \int_{\partial\sigma_r} F \cdot d\ell$, and, from among all these quantities obtained for all those planes, we select the largest one. The plane attaining this maximum quantity will be the one orthogonal to $\nabla \times F(\boldsymbol{x})$, and the unit vector $\lim_{r \to 0^+} \nu_{\sigma_r}$ will indicate the direction of $\nabla \times F(\boldsymbol{x})$.
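The limit describing the curl can be observed numerically. The following sketch uses a hypothetical sample field $F(x,y,z) = (-y^3, x^3, 0)$, whose curl is $(0, 0, 3x^2 + 3y^2)$, and computes the circulation along small horizontal circles centered at $(1, 0, 0)$, divided by $\pi r^2$. The function name `circulation_density` is ours, not the book's.

```python
import math

def F(x, y, z):
    # hypothetical sample field, with curl (0, 0, 3x^2 + 3y^2)
    return (-y ** 3, x ** 3, 0.0)

def circulation_density(center, r, n=2000):
    """(1 / (pi r^2)) times the line integral of F along the circle of radius r
    centered at `center`, lying in a horizontal plane and described
    counter-clockwise (rectangle rule on n equally spaced parameter values)."""
    cx, cy, cz = center
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * h
        x, y = cx + r * math.cos(t), cy + r * math.sin(t)
        fx, fy, _ = F(x, y, cz)
        total += (fx * (-r * math.sin(t)) + fy * (r * math.cos(t))) * h
    return total / (math.pi * r * r)

for r in (0.5, 0.1, 0.01):
    print(r, circulation_density((1.0, 0.0, 0.0), r))
```

For this field the quotient equals $3 + \tfrac{3}{2} r^2$, so the printed values approach $3$, the third component of the curl at $(1,0,0)$, as $r$ decreases.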


Divergence Assume at first that $\nabla \cdot F$ is constant. Let us consider the closed ball $B_r$, with radius $r > 0$ (and, hence, volume $\frac{4}{3}\pi r^3$), which we can easily parametrize in spherical coordinates by a 3-surface $\sigma_r : I \to \mathbb{R}^3$ such that $\det J\sigma_r(\boldsymbol{u}) > 0$ for every $\boldsymbol{u} \in \mathring{I}$. Then

$$\int_{B_r} \nabla \cdot F = \frac{4}{3}\pi r^3\, \nabla \cdot F\,,$$

whence, by the Gauss–Ostrogradski formula,

$$\nabla \cdot F = \frac{1}{\frac{4}{3}\pi r^3} \int_{\partial\sigma_r} F \cdot dS\,.$$

We thus see that the divergence provides us a measure of the flux, per unit volume, of the vector field through the spherical surface $\partial\sigma_r$. It will be positive if the flow mostly crosses the surface from the interior toward the exterior, and negative otherwise. This explains the name “divergence.”

When $\nabla \cdot F$ is not constant, let us fix a point $\boldsymbol{x}$ in the domain $O$ and proceed as previously, taking a ball $B_r = B(\boldsymbol{x}, r)$, centered at $\boldsymbol{x}$, with radius $r$, parametrized by some $\sigma_r$ with $\det J\sigma_r > 0$. By continuity, if $r > 0$ is “small,” we can think of $\nabla \cdot F$ as being “almost constant” on this ball. More precisely, we will have that

$$\nabla \cdot F(\boldsymbol{x}) = \lim_{r \to 0^+} \frac{1}{\frac{4}{3}\pi r^3} \int_{\partial\sigma_r} F \cdot dS\,.$$

We thus interpret physically the divergence at a point $\boldsymbol{x}$ as a measure of the flux per unit volume of the vector field through a spherical surface centered at $\boldsymbol{x}$ with a very small radius $r$.


12.12 The Integral on an Oriented Compact Manifold 


In this section, we return to the setting of differentiable manifolds. We would like to define the integral of an $M$-differential form over an $M$-differentiable manifold. There is a difficulty in doing this, however, due to some problems related to “orientation.” Once the definition is given, we will finally be able to obtain a Stokes–Cartan formula also in this setting.

We want to define an orientation for $\mathcal{M}$, which will automatically induce one for $\partial\mathcal{M}$ as well. Given $\boldsymbol{x} \in \mathcal{M}$, let $\sigma : I \to \mathbb{R}^N$ be a local $M$-parametrization, with $\sigma(0) = \boldsymbol{x}$. Since $\sigma'(\boldsymbol{u})$ has rank $M$ for every $\boldsymbol{u} \in I$, we have that the vectors

$$\Big[ \frac{\partial \sigma}{\partial u_1}(\boldsymbol{u}), \dots, \frac{\partial \sigma}{\partial u_M}(\boldsymbol{u}) \Big]$$

form a basis for a real vector space of dimension $M$, which will be called the “tangent space” to $\mathcal{M}$ at the point $\sigma(\boldsymbol{u})$ and will be denoted by $T_{\sigma(\boldsymbol{u})}\mathcal{M}$ (in particular, if $\boldsymbol{u} = 0$, we have the tangent space $T_{\boldsymbol{x}}\mathcal{M}$).

Now, once $\boldsymbol{u} \in I$ is considered, the point $\sigma(\boldsymbol{u})$ will also belong to the images of other $M$-parametrizations. There can be a $\tilde\sigma : J \to \mathbb{R}^N$ such that $\sigma(\boldsymbol{u}) = \tilde\sigma(\boldsymbol{v})$ for some $\boldsymbol{v} \in J$. We have seen how to change the orientation of such a $\tilde\sigma$ by a simple change of variables. Hence, we can choose these local $M$-parametrizations so that the bases of the tangent space $T_{\sigma(\boldsymbol{u})}\mathcal{M} = T_{\tilde\sigma(\boldsymbol{v})}\mathcal{M}$ associated with them are all coherently oriented; this means that the matrix that makes it possible to pass from one basis to the other has a positive determinant. We will refer to such a choice as being “coherent.”

A coherent choice of local $M$-parametrizations is therefore always possible, in a neighborhood of $\boldsymbol{x}$. But we are interested in the possibility of making a global coherent choice, i.e., for all possible local $M$-parametrizations of $\mathcal{M}$. This is not always possible. For example, it can be seen that this cannot be done for a Möbius strip, which is a 2-manifold.

Whenever all the local $M$-parametrizations of $\mathcal{M}$ can be chosen coherently, we say that $\mathcal{M}$ is “orientable.” From now on we will always assume that $\mathcal{M}$ is orientable and that a coherent choice of all the local $M$-parametrizations has been made. We then say that $\mathcal{M}$ has been “oriented.”

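Once we have oriented $\mathcal{M}$, let us see how it is possible, from that, to define an orientation on $\partial\mathcal{M}$. Given $\boldsymbol{x} \in \partial\mathcal{M}$, let $\sigma : I \to \mathbb{R}^N$ be a local $M$-parametrization with $\sigma(0) = \boldsymbol{x}$; recall that in this case, $I$ is the rectangle $[-a, a]^{M-1} \times [0, a]$. Since $\partial\mathcal{M}$ is an $(M-1)$-manifold, the tangent vector space $T_{\boldsymbol{x}}\partial\mathcal{M}$ has dimension $M - 1$ and is a subspace of $T_{\boldsymbol{x}}\mathcal{M}$, which has dimension $M$. Hence, there are two unit vectors in $T_{\boldsymbol{x}}\mathcal{M}$, which are orthogonal to $T_{\boldsymbol{x}}\partial\mathcal{M}$. We denote by $\nu(\boldsymbol{x})$ the one obtained as a directional derivative $\frac{\partial \sigma}{\partial \boldsymbol{v}}(0) = d\sigma(0)\boldsymbol{v}$ for some $\boldsymbol{v} = (v_1, \dots, v_M)$ with $v_M < 0$. At this point, we choose a basis $[\boldsymbol{v}^{(1)}(\boldsymbol{x}), \dots, \boldsymbol{v}^{(M-1)}(\boldsymbol{x})]$ in $T_{\boldsymbol{x}}\partial\mathcal{M}$ such that $[\nu(\boldsymbol{x}), \boldsymbol{v}^{(1)}(\boldsymbol{x}), \dots, \boldsymbol{v}^{(M-1)}(\boldsymbol{x})]$ is a basis of $T_{\boldsymbol{x}}\mathcal{M}$ oriented coherently with the one already chosen in this space. Proceeding in this way for every $\boldsymbol{x}$, it can be seen that $\partial\mathcal{M}$ is thus oriented: We have assigned to it the “induced orientation” from that of $\mathcal{M}$.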

Assume now that $\mathcal{M}$, besides being oriented, is compact. Given an $M$-differential form $\omega$ defined on an open set $O$ containing $\mathcal{M}$, we would like to define what we mean by integral of $\omega$ on $\mathcal{M}$.

In the case where $\omega|_{\mathcal{M}}$, the restriction of $\omega$ to the set $\mathcal{M}$, is zero outside the image of a single local $M$-parametrization $\sigma : I \to \mathbb{R}^N$, we simply set

$$\int_{\mathcal{M}} \omega = \int_\sigma \omega\,.$$

It can be proved using the same reasoning as in Proposition 11.62 that this is a good definition.

In general, we saw in Theorem 10.39 that for every $\boldsymbol{x} \in \mathcal{M}$ there is a neighborhood $A'$ of $\boldsymbol{x}$ such that $A' \cap \mathcal{M}$ can be $M$-parametrized by some injective function $\sigma : I \to \mathbb{R}^N$. Moreover, if $\boldsymbol{x} \in \partial\mathcal{M}$, the $M$-parametrization $\sigma$ is such that the interior points of a single face of the rectangle $I$ are sent on $\partial\mathcal{M}$.

For every $\boldsymbol{x}$ there is an open ball $B(\boldsymbol{x}, \rho)$ contained in $A'$. We thus have an open covering of $\mathcal{M}$ made of these open balls. Since $\mathcal{M}$ is compact, by Theorem 4.9 there exists a finite open subcovering, which we denote by $A_1, \dots, A_n$. Let $\phi_1, \dots, \phi_n$ be a partition of unity associated with this covering. Since each $\phi_k \cdot \omega|_{\mathcal{M}}$ is zero outside the image of a single local $M$-parametrization, we can define the integral of $\omega$ on $\mathcal{M}$ as

$$\int_{\mathcal{M}} \omega = \sum_{k=1}^n \int_{\mathcal{M}} \phi_k \cdot \omega\,.$$




It is possible to prove, using the same reasoning as in Proposition 11.64, that such a 
definition depends neither on the (coherent) choice of the local M-parametrizations 
nor on the particular partition of unity defined previously. 

We can now state the Stokes—Cartan theorem for manifolds. 


Theorem 12.23 (Stokes–Cartan Theorem II) If $\omega$ is an $M$-differential form of class $C^1$ defined on an open set $O \subseteq \mathbb{R}^N$ and $\mathcal{M}$ is an oriented compact $(M+1)$-manifold contained in $O$, then

$$\int_{\mathcal{M}} d\omega = \int_{\partial\mathcal{M}} \omega$$

(provided the orientation on $\partial\mathcal{M}$ is the induced one).


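Proof We consider the local parametrizations provided by the preceding argument. Let us first assume that $\omega|_{\mathcal{M}}$ is equal to zero outside the image of a single local $M$-parametrization $\sigma : I \to \mathbb{R}^N$. Then

$$\int_{\mathcal{M}} d\omega = \int_\sigma d\omega = \int_{\partial\sigma} \omega\,.$$

We now have two possible cases.

Case 1: $\sigma(I) \cap \partial\mathcal{M} = \emptyset$. By the injectivity of $\sigma$ on an open set containing $I$, we have that the image of $\partial\sigma$ is contained in the boundary of $\sigma(I)$ in the metric space $\mathcal{M}$. Hence, since it is continuous, we have that $\omega$ must be equal to zero on all points of the image of $\partial\sigma$, implying that

$$\int_{\partial\sigma} \omega = 0\,.$$

On the other hand, since $\omega$ is zero on $\partial\mathcal{M}$, we have that

$$\int_{\partial\mathcal{M}} \omega = 0\,,$$

so the identity is verified in this case.

Case 2: $\sigma(I) \cap \partial\mathcal{M} \ne \emptyset$. We know that the interior points of a single face of $I$ are sent on $\partial\mathcal{M}$. For example, let $\alpha_1^- : I_1 \to \mathbb{R}^{M+1}$ be that part of $\partial I$ such that $\sigma(\alpha_1^-(I_1)) \subseteq \partial\mathcal{M}$, indeed the only one. Then $\sigma \circ \alpha_1^- : I_1 \to \mathbb{R}^N$ is a local parametrization of $\partial\mathcal{M}$, and $\omega|_{\partial\mathcal{M}}$ is equal to zero outside its image, whence

$$\int_{\partial\mathcal{M}} \omega = \int_{\sigma \circ \alpha_1^-} \omega\,.$$

On the other hand, since $\sigma : I \to \mathbb{R}^N$ is injective, we have that the image of $\partial\sigma$ is contained in the boundary of $\sigma(I)$. Hence, since it is continuous, we have that $\omega$ must be equal to zero on all points of the images of all the other faces

$$\sigma \circ \beta_1^+\,, \ \sigma \circ \alpha_2^+\,, \ \sigma \circ \beta_2^-\,, \ \dots$$

Then

$$\int_{\partial\sigma} \omega = \int_{\sigma \circ \alpha_1^-} \omega + \int_{\sigma \circ \beta_1^+} \omega + \int_{\sigma \circ \alpha_2^+} \omega + \int_{\sigma \circ \beta_2^-} \omega + \dots = \int_{\sigma \circ \alpha_1^-} \omega\,,$$

showing that even in this case the identity holds.

Consider now the general case. With the previously found partition of unity, $\phi_k \cdot \omega|_{\mathcal{M}}$ is equal to zero outside the image of a single local $M$-parametrization for every $k = 1, \dots, n$. Since

$$\sum_{k=1}^n d\phi_k \wedge \omega = d\Big( \sum_{k=1}^n \phi_k \Big) \wedge \omega = d(1) \wedge \omega = 0\,,$$

we then have

$$\begin{aligned}
\int_{\mathcal{M}} d\omega &= \sum_{k=1}^n \int_{\mathcal{M}} \phi_k \cdot d\omega = \sum_{k=1}^n \int_{\mathcal{M}} d\phi_k \wedge \omega + \sum_{k=1}^n \int_{\mathcal{M}} \phi_k \cdot d\omega \\
&= \sum_{k=1}^n \int_{\mathcal{M}} d(\phi_k \cdot \omega) = \sum_{k=1}^n \int_{\partial\mathcal{M}} \phi_k \cdot \omega = \int_{\partial\mathcal{M}} \omega\,,
\end{aligned}$$

and the proof is completed. □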


12.13 Closed and Exact Differential Forms 


We are now interested in the following problem. Given an $M$-differential form $\omega$, when is it possible to write it as the exterior differential of another differential form $\tilde\omega$ to be determined?

We assume $1 \le M \le N$ and that $\omega \in \mathcal{F}_M(O, \mathbb{R}^N)$.


Definition 12.24 An $M$-differential form $\omega$ is said to be “closed” if $d\omega = 0$; it is said to be “exact” if there is an $(M-1)$-differential form $\tilde\omega$ such that $d\tilde\omega = \omega$.


Every exact differential form is closed: If $\omega = d\tilde\omega$, then $d\omega = d(d\tilde\omega) = 0$. The converse is not always true.


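Example The 1-differential form defined on $\mathbb{R}^2 \setminus \{(0, 0)\}$ by

$$\omega(x, y) = \frac{-y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy$$

is closed, as is easily verified. Indeed, setting

$$F_1(x, y) = \frac{-y}{x^2 + y^2}, \qquad F_2(x, y) = \frac{x}{x^2 + y^2},$$

for every $(x, y) \neq (0, 0)$, we have

$$\frac{\partial F_2}{\partial x}(x, y) = \frac{\partial F_1}{\partial y}(x, y).$$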


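Let us compute the line integral of its vector field $F = (F_1, F_2)$ on the curve $\sigma : [0, 2\pi] \to \mathbb{R}^2$, defined by $\sigma(t) = (\cos t, \sin t)$:

$$\int_\sigma F \cdot d\ell = \int_0^{2\pi} F(\sigma(t)) \cdot \sigma'(t)\,dt = \int_0^{2\pi} (-\sin t, \cos t) \cdot (-\sin t, \cos t)\,dt = 2\pi.$$

Assume by contradiction that $\omega$ is exact, i.e., that there exists a $C^1$-function $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}$ such that $\frac{\partial f}{\partial x} = F_1$ and $\frac{\partial f}{\partial y} = F_2$. In that case, since $\sigma(0) = \sigma(2\pi)$, we should have

$$\int_\sigma F \cdot d\ell = \int_\sigma \nabla f \cdot d\ell = f(\sigma(2\pi)) - f(\sigma(0)) = 0,$$

which contradicts the preceding computation.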
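
The two facts used in this example can be checked numerically. The following is only a minimal Python sketch, not part of the original text: the function names `line_integral_on_circle` and `closedness_defect` are illustrative, the line integral is approximated by a midpoint rule, and the partial derivatives by central differences.

```python
import math

def F(x, y):
    """Vector field of the example: F = (-y, x) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def line_integral_on_circle(steps=10_000):
    """Midpoint-rule approximation of the line integral of F along
    sigma(t) = (cos t, sin t), with t in [0, 2*pi]."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)  # components of sigma'(t)
        f1, f2 = F(x, y)
        total += (f1 * dx + f2 * dy) * h
    return total

def closedness_defect(x, y, h=1e-5):
    """Central-difference estimate of dF2/dx - dF1/dy at (x, y);
    it should vanish (up to rounding) because the form is closed."""
    dF2_dx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2 * h)
    dF1_dy = (F(x, y + h)[0] - F(x, y - h)[0]) / (2 * h)
    return dF2_dx - dF1_dy

print(line_integral_on_circle(), 2 * math.pi)
print(closedness_defect(0.7, -1.3))
```

The first line printed compares the approximate integral with $2\pi$; the second should be close to zero at any point other than the origin.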


The Poincaré Theorem, stated in what follows, says that the situation of the 
preceding example can never happen if, for example, the open set O on which the 
differential form is defined has a particular shape. 


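Definition 12.25 A set $O$ is “star-shaped” with respect to a point $\bar{\boldsymbol{x}}$ if, with each of its points $\boldsymbol{x}$, the set $O$ contains the whole segment joining $\bar{\boldsymbol{x}}$ to $\boldsymbol{x}$, i.e.,

$$[\bar{\boldsymbol{x}}, \boldsymbol{x}] = \{\bar{\boldsymbol{x}} + t(\boldsymbol{x} - \bar{\boldsymbol{x}}) : t \in [0, 1]\} \subseteq O.$$

For example, every convex set is star-shaped (with respect to any of its points). In particular, a ball, a rectangle, or even the whole space $\mathbb{R}^N$ is star-shaped. Clearly, the set $\mathbb{R}^2 \setminus \{(0, 0)\}$ considered previously is not star-shaped.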




Theorem 12.26 (Poincaré Theorem) Assume that $O$, an open subset of $\mathbb{R}^N$, is star-shaped with respect to a point $\bar{\boldsymbol{x}}$. For $1 \le M \le N$, an $M$-differential form $\omega$ of class $C^1$ defined on $O$ is exact if and only if it is closed.


To prove the theorem, we need a preliminary result. To simplify the notations, we can assume without loss of generality that

$$\bar{\boldsymbol{x}} = (0, 0, \dots, 0).$$

Consider now the set $[0, 1] \times O$, whose elements will be denoted by $(t, \boldsymbol{x})$, with $\boldsymbol{x} = (x_1, \dots, x_N)$. Let us define the linear operator $K$, which transforms a generic $M$-differential form $\alpha$ defined on $[0, 1] \times O$ into an $(M-1)$-differential form $K(\alpha)$ defined on $O$ in the following way:

(a) If $\alpha(t, \boldsymbol{x}) = f(t, \boldsymbol{x})\,dt \wedge dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}}$ (notice here the appearance of the term $dt$), then

$$K(\alpha)(\boldsymbol{x}) = \Big(\int_0^1 f(t, \boldsymbol{x})\,dt\Big)\, dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}}.$$

(b) If $\alpha(t, \boldsymbol{x}) = f(t, \boldsymbol{x})\,dx_{i_1} \wedge \dots \wedge dx_{i_M}$ (here the term $dt$ does not appear), then $K(\alpha) = 0$.

(c) In all other cases, $K$ is defined by linearity (for each component of a generic $M$-differential form $\alpha$, the term $dt$ may or may not appear, and the two previous definitions apply).

Moreover, we define the functions $\psi, \xi : O \to [0, 1] \times O$ as follows:

$$\psi(x_1, \dots, x_N) = (0, x_1, \dots, x_N), \qquad \xi(x_1, \dots, x_N) = (1, x_1, \dots, x_N).$$

Lemma 12.27 If $\alpha$ is an $M$-differential form of class $C^1$ defined on $[0, 1] \times O$, then

$$d(K(\alpha)) + K(d\alpha) = T_\xi \alpha - T_\psi \alpha.$$


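Proof Because of the linearity, it will be sufficient to consider the two cases where the differential form $\alpha$ is one of the two kinds considered in (a) and (b).

(a) If $\alpha(t, \boldsymbol{x}) = f(t, \boldsymbol{x})\,dt \wedge dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}}$, then by the Leibniz rule we have

$$d(K(\alpha))(\boldsymbol{x}) = \sum_{m=1}^{N} \Big(\int_0^1 \frac{\partial f}{\partial x_m}(t, \boldsymbol{x})\,dt\Big)\, dx_m \wedge dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}};$$

on the other hand,

$$\begin{aligned}
d\alpha(t, \boldsymbol{x}) &= \frac{\partial f}{\partial t}(t, \boldsymbol{x})\,dt \wedge dt \wedge dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}} + \sum_{m=1}^{N} \frac{\partial f}{\partial x_m}(t, \boldsymbol{x})\,dx_m \wedge dt \wedge dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}} \\
&= -\sum_{m=1}^{N} \frac{\partial f}{\partial x_m}(t, \boldsymbol{x})\,dt \wedge dx_m \wedge dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}},
\end{aligned}$$

and hence

$$K(d\alpha)(\boldsymbol{x}) = -\sum_{m=1}^{N} \Big(\int_0^1 \frac{\partial f}{\partial x_m}(t, \boldsymbol{x})\,dt\Big)\, dx_m \wedge dx_{i_1} \wedge \dots \wedge dx_{i_{M-1}} = -d(K(\alpha))(\boldsymbol{x}).$$

Moreover, since the first component of $\psi$ and of $\xi$ is constant, we have $T_\psi \alpha = T_\xi \alpha = 0$. Hence, the identity is proved in this case.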

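(b) If $\alpha(t, \boldsymbol{x}) = f(t, \boldsymbol{x})\,dx_{i_1} \wedge \dots \wedge dx_{i_M}$, we have $K(\alpha) = 0$ and hence $d(K(\alpha)) = 0$; on the other hand,

$$d\alpha(t, \boldsymbol{x}) = \frac{\partial f}{\partial t}(t, \boldsymbol{x})\,dt \wedge dx_{i_1} \wedge \dots \wedge dx_{i_M} + \sum_{m=1}^{N} \frac{\partial f}{\partial x_m}(t, \boldsymbol{x})\,dx_m \wedge dx_{i_1} \wedge \dots \wedge dx_{i_M},$$

and hence

$$K(d\alpha)(\boldsymbol{x}) = \Big(\int_0^1 \frac{\partial f}{\partial t}(t, \boldsymbol{x})\,dt\Big)\, dx_{i_1} \wedge \dots \wedge dx_{i_M} = \big(f(1, \boldsymbol{x}) - f(0, \boldsymbol{x})\big)\, dx_{i_1} \wedge \dots \wedge dx_{i_M}.$$

Moreover,

$$T_\xi \alpha(\boldsymbol{x}) = f(1, \boldsymbol{x})\, d\xi_{i_1}(\boldsymbol{x}) \wedge \dots \wedge d\xi_{i_M}(\boldsymbol{x}) = f(1, \boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M},$$

$$T_\psi \alpha(\boldsymbol{x}) = f(0, \boldsymbol{x})\, d\psi_{i_1}(\boldsymbol{x}) \wedge \dots \wedge d\psi_{i_M}(\boldsymbol{x}) = f(0, \boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M}.$$

The formula is thus proved in this case as well. □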
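We are now ready for the proof of the Poincaré Theorem.

Proof By linearity we may assume, for simplicity, that

$$\omega(\boldsymbol{x}) = f_{i_1, \dots, i_M}(\boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M}.$$

Let $\varphi : [0, 1] \times O \to O$ be defined by

$$\varphi(t, x_1, \dots, x_N) = (t x_1, \dots, t x_N).$$

Consider $T_\varphi \omega$, the pullback transformation through $\varphi$ of $\omega$, and set $\tilde\omega = K(T_\varphi \omega)$. We want to prove that $d\tilde\omega = \omega$. Since $\omega$ is closed, by the linearity of $K$ we have

$$K(d(T_\varphi \omega)) = K(T_\varphi(d\omega)) = K(T_\varphi(0)) = K(0) = 0.$$

By the preceding lemma,

$$\begin{aligned}
d\tilde\omega &= d(K(T_\varphi \omega)) \\
&= T_\xi(T_\varphi \omega) - T_\psi(T_\varphi \omega) - K(d(T_\varphi \omega)) \\
&= T_\xi(T_\varphi \omega) - T_\psi(T_\varphi \omega) \\
&= T_{\varphi \circ \xi}\, \omega - T_{\varphi \circ \psi}\, \omega.
\end{aligned}$$

Since $\varphi \circ \xi$ is the identity function and $\varphi \circ \psi$ is the null function, we have that $T_{\varphi \circ \xi}\, \omega = \omega$ and $T_{\varphi \circ \psi}\, \omega = 0$, which concludes the proof. □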


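Remark 12.28 If $\omega(\boldsymbol{x}) = f_{i_1, \dots, i_M}(\boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M}$, then

$$\begin{aligned}
T_\varphi \omega(t, \boldsymbol{x}) &= f_{i_1, \dots, i_M}(t\boldsymbol{x})\, (x_{i_1}\,dt + t\,dx_{i_1}) \wedge \dots \wedge (x_{i_M}\,dt + t\,dx_{i_M}) \\
&= t^M f_{i_1, \dots, i_M}(t\boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M} \\
&\quad + t^{M-1} f_{i_1, \dots, i_M}(t\boldsymbol{x}) \sum_{s=1}^{M} (-1)^{s-1} x_{i_s}\, dt \wedge dx_{i_1} \wedge \dots \wedge \widehat{dx_{i_s}} \wedge \dots \wedge dx_{i_M},
\end{aligned}$$

where the hat indicates that the factor is omitted, and hence

$$K(T_\varphi \omega)(\boldsymbol{x}) = \sum_{s=1}^{M} (-1)^{s-1} x_{i_s} \Big(\int_0^1 t^{M-1} f_{i_1, \dots, i_M}(t\boldsymbol{x})\,dt\Big)\, dx_{i_1} \wedge \dots \wedge \widehat{dx_{i_s}} \wedge \dots \wedge dx_{i_M}.$$

Thus, for a general closed $M$-differential form

$$\omega(\boldsymbol{x}) = \sum_{1 \le i_1 < \dots < i_M \le N} f_{i_1, \dots, i_M}(\boldsymbol{x})\, dx_{i_1} \wedge \dots \wedge dx_{i_M},$$

an $(M-1)$-differential form $\tilde\omega$ such that $d\tilde\omega = \omega$ is given by

$$\tilde\omega(\boldsymbol{x}) = K(T_\varphi \omega)(\boldsymbol{x}) = \sum_{1 \le i_1 < \dots < i_M \le N}\ \sum_{s=1}^{M} (-1)^{s-1} x_{i_s} \Big(\int_0^1 t^{M-1} f_{i_1, \dots, i_M}(t\boldsymbol{x})\,dt\Big)\, dx_{i_1} \wedge \dots \wedge \widehat{dx_{i_s}} \wedge \dots \wedge dx_{i_M}.$$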


We consider now some corollaries that hold true for the case $N = 3$. We will always assume that $\bar{\boldsymbol{x}} = (0, 0, 0)$.

The case $M = 1$. A $C^1$-vector field $F = (F_1, F_2, F_3)$, defined on an open subset $O$ of $\mathbb{R}^3$, determines a 1-differential form

$$\omega(\boldsymbol{x}) = F_1(\boldsymbol{x})\,dx_1 + F_2(\boldsymbol{x})\,dx_2 + F_3(\boldsymbol{x})\,dx_3.$$

It is closed if and only if $\nabla \times F = 0$. In this case, the vector field is said to be “irrotational.” On the other hand, the vector field $F$ is said to be “conservative” if there is a function $f : O \to \mathbb{R}$ such that $F = \nabla f$. In that case, $f$ is a “scalar potential” of the vector field $F$.¹


Theorem 12.29 If $O \subseteq \mathbb{R}^3$ is star-shaped with respect to the origin, then the vector field $F : O \to \mathbb{R}^3$ is conservative if and only if it is irrotational, and in that case a function $f : O \to \mathbb{R}$ such that $F = \nabla f$ is given by

$$f(\boldsymbol{x}) = \int_0^1 F(t\boldsymbol{x}) \cdot \boldsymbol{x}\,dt.$$

Any other function $\tilde f : O \to \mathbb{R}$ such that $F = \nabla \tilde f$ is obtained from $f$ by adding a constant.


Proof The first part follows directly from the formula in Remark 12.28. Assume now that $F = \nabla f = \nabla \tilde f$, and set $g = f - \tilde f$. Then $\nabla g = 0$ on $O$, which is star-shaped with respect to the origin. Using the Fundamental Theorem for the function $\mathcal{F}(t) = g(t\boldsymbol{x})$, we have that

$$g(\boldsymbol{x}) = \mathcal{F}(1) = \mathcal{F}(0) + \int_0^1 \mathcal{F}'(t)\,dt = g(0) + \int_0^1 \nabla g(t\boldsymbol{x}) \cdot \boldsymbol{x}\,dt = g(0)$$

for every $\boldsymbol{x} \in O$. Thus, $f - \tilde f$ must be constant on $O$. □

¹ Beware that in Mechanics it is often the function $-f$ that is called “the potential.”


Example Consider the vector field $F(x, y, z) = (2xz + y, x, x^2)$, which, as is easily verified, is irrotational. A scalar potential is then given by

$$f(x, y, z) = \int_0^1 \big((2t^2x^2z + txy) + txy + t^2x^2z\big)\,dt = xy + x^2z.$$
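
As a numerical sanity check of this example, the formula of Theorem 12.29 can be evaluated by quadrature and compared with $xy + x^2z$. This is a minimal Python sketch under stated assumptions: the helper names `simpson`, `scalar_potential` and `closed_form` are illustrative, and the composite Simpson rule is used only because the integrand here is a polynomial of degree 2 in $t$, for which it is exact up to rounding.

```python
def F(x, y, z):
    """Vector field of the example."""
    return (2 * x * z + y, x, x * x)

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

def scalar_potential(x, y, z):
    """f(x) = integral over [0, 1] of F(t x) . x dt."""
    def integrand(t):
        f1, f2, f3 = F(t * x, t * y, t * z)
        return f1 * x + f2 * y + f3 * z
    return simpson(integrand, 0.0, 1.0)

def closed_form(x, y, z):
    """Potential computed by hand in the example."""
    return x * y + x * x * z

for point in [(1.0, 2.0, 3.0), (-0.5, 0.25, 4.0)]:
    print(point, scalar_potential(*point), closed_form(*point))
```

For each sample point, the two printed values should agree up to rounding errors.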


The case $M = 2$. A $C^1$-vector field $F = (F_1, F_2, F_3)$, defined on an open subset $O$ of $\mathbb{R}^3$, determines a 2-differential form

$$\omega(\boldsymbol{x}) = F_1(\boldsymbol{x})\,dx_2 \wedge dx_3 + F_2(\boldsymbol{x})\,dx_3 \wedge dx_1 + F_3(\boldsymbol{x})\,dx_1 \wedge dx_2.$$

This is closed if and only if $\nabla \cdot F = 0$. In this case, the vector field is said to be “solenoidal.” We say that $F$ has a “vector potential” if there is a vector field $V = (V_1, V_2, V_3)$ such that $F = \nabla \times V$.

Theorem 12.30 If $O \subseteq \mathbb{R}^3$ is star-shaped with respect to the origin, then the vector field $F : O \to \mathbb{R}^3$ has a vector potential if and only if it is solenoidal, and in that case a vector field $V : O \to \mathbb{R}^3$ for which $F = \nabla \times V$ is given by

$$V(\boldsymbol{x}) = \int_0^1 t\,\big(F(t\boldsymbol{x}) \times \boldsymbol{x}\big)\,dt.$$

Any other vector field $\tilde V : O \to \mathbb{R}^3$ such that $F = \nabla \times \tilde V$ is obtained from $V$ by adding the gradient of an arbitrary scalar function.


Proof The first part is obtained by applying the formula in Remark 12.28. The second part follows from the fact that if $F = \nabla \times V = \nabla \times \tilde V$, then $V - \tilde V$ is irrotational and hence, by the previous theorem, $V - \tilde V$ is a conservative vector field, thereby completing the proof. □


Example Consider the solenoidal vector field $F(x, y, z) = (y, z, x)$. A vector potential is then given by

$$V(x, y, z) = \int_0^1 t\,(ty, tz, tx) \times (x, y, z)\,dt = \frac{1}{3}\,(z^2 - xy,\ x^2 - yz,\ y^2 - xz).$$
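
The vector potential of this example can be verified numerically. This is a minimal Python sketch, not a definitive implementation: the names `cross`, `vector_potential` and `curl` are illustrative, the integral of Theorem 12.30 is approximated componentwise by the composite Simpson rule, and the curl is estimated by central differences with an assumed step `h`.

```python
def F(x, y, z):
    """Solenoidal vector field of the example."""
    return (y, z, x)

def cross(a, b):
    """Cross product of two vectors of R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vector_potential(x, y, z, n=100):
    """V(x) = integral over [0, 1] of t (F(t x) x x) dt, Simpson rule, n even."""
    p = (x, y, z)
    h = 1.0 / n
    acc = [0.0, 0.0, 0.0]
    for k in range(n + 1):
        t = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        c = cross(F(t * x, t * y, t * z), p)
        for i in range(3):
            acc[i] += weight * t * c[i]
    return tuple(a * h / 3 for a in acc)

def curl(field, p, h=1e-3):
    """Central-difference estimate of the curl of `field` at the point p."""
    def d(i, j):
        # partial derivative of component i with respect to coordinate j
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (field(*plus)[i] - field(*minus)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

p = (0.5, -1.2, 2.0)
print(vector_potential(*p))
print(curl(vector_potential, p), F(*p))
```

The second line printed compares the estimated curl of $V$ with $F$ at the sample point; the two triples should agree up to small numerical errors.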




The case $M = 3$. A $C^1$-scalar function $f$, defined on an open subset $O$ of $\mathbb{R}^3$, determines a 3-differential form

$$\omega(\boldsymbol{x}) = f(\boldsymbol{x})\,dx_1 \wedge dx_2 \wedge dx_3.$$

This is necessarily always closed since $d\omega$ is a 4-differential form defined on a subset of $\mathbb{R}^3$.


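Theorem 12.31 If $O \subseteq \mathbb{R}^3$ is star-shaped with respect to the origin, the function $f : O \to \mathbb{R}$ is always of the type $f = \nabla \cdot W$, where $W : O \to \mathbb{R}^3$ is the vector field defined by

$$W(\boldsymbol{x}) = \Big(\int_0^1 t^2 f(t\boldsymbol{x})\,dt\Big)\,\boldsymbol{x}.$$

Any other vector field $\tilde W : O \to \mathbb{R}^3$ such that $f = \nabla \cdot \tilde W$ is obtained from $W$ by adding the curl of an arbitrary vector field.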


Proof The first part follows from Remark 12.28. Concerning the second part, if $\nabla \cdot W = \nabla \cdot \tilde W$, then, by the preceding theorem, $W - \tilde W$ has a vector potential. The proof is thus completed. □


Example Consider the scalar function $f(x, y, z) = xyz$. Then a vector field whose divergence is $f$ is given by

$$W(x, y, z) = \Big(\int_0^1 t^5 xyz\,dt\Big)\,(x, y, z) = \frac{1}{6}\,(x^2yz,\ xy^2z,\ xyz^2).$$
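
The same kind of check applies to this last example. The following minimal Python sketch (with illustrative names `simpson`, `W` and `divergence`, and an assumed finite-difference step `h`) evaluates the formula of Theorem 12.31 by the composite Simpson rule and estimates the divergence by central differences.

```python
def f(x, y, z):
    """Scalar function of the example."""
    return x * y * z

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

def W(x, y, z):
    """W(x) = (integral over [0, 1] of t^2 f(t x) dt) x."""
    c = simpson(lambda t: t * t * f(t * x, t * y, t * z), 0.0, 1.0)
    return (c * x, c * y, c * z)

def divergence(field, p, h=1e-3):
    """Central-difference estimate of the divergence of `field` at p."""
    total = 0.0
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        total += (field(*plus)[j] - field(*minus)[j]) / (2 * h)
    return total

p = (0.8, -1.5, 2.0)
print(W(*p))
print(divergence(W, p), f(*p))
```

The second line printed compares the estimated divergence of $W$ with $f$ at the sample point; they should agree up to the quadrature and finite-difference errors.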


12.14 On the Precise Definition of a Differential Form 


We started this chapter with an informal treatment of differential forms. In this 
section, we will finally provide their precise definition. 

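Consider, for every positive integer $M$, the set $\Omega_M(\mathbb{R}^N)$ consisting of the $M$-linear antisymmetric functions on $\mathbb{R}^N$ with real values. These are the functions

$$\varphi : \underbrace{\mathbb{R}^N \times \dots \times \mathbb{R}^N}_{M \text{ times}} \to \mathbb{R},$$

which assign to each $M$-tuple $(\boldsymbol{v}^{(1)}, \dots, \boldsymbol{v}^{(M)})$ of vectors in $\mathbb{R}^N$ a real number

$$\varphi(\boldsymbol{v}^{(1)}, \dots, \boldsymbol{v}^{(M)}).$$

They need to be linear in each variable, i.e.,

$$\varphi(\dots, a\boldsymbol{v} + b\boldsymbol{w}, \dots) = a\,\varphi(\dots, \boldsymbol{v}, \dots) + b\,\varphi(\dots, \boldsymbol{w}, \dots),$$

and antisymmetric, i.e.,

$$\varphi(\dots, \boldsymbol{v}^{(i)}, \dots, \boldsymbol{v}^{(j)}, \dots) = -\varphi(\dots, \boldsymbol{v}^{(j)}, \dots, \boldsymbol{v}^{(i)}, \dots).$$

Introducing the usual operations $\varphi + \psi$ and $c\,\varphi$ among functions, with $c \in \mathbb{R}$, the sets $\Omega_M(\mathbb{R}^N)$ may be considered as real vector spaces. We also adopt the convention that $\Omega_0(\mathbb{R}^N) = \mathbb{R}$.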


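If we choose the indices $i_1, \dots, i_M$ in the set $\{1, \dots, N\}$, we can define the $M$-linear antisymmetric function $dx_{i_1, \dots, i_M}$: it is the function that associates to the vectors

$$\boldsymbol{v}^{(1)} = \begin{pmatrix} v_1^{(1)} \\ \vdots \\ v_N^{(1)} \end{pmatrix}, \quad \dots, \quad \boldsymbol{v}^{(M)} = \begin{pmatrix} v_1^{(M)} \\ \vdots \\ v_N^{(M)} \end{pmatrix}$$

the real number

$$\det \begin{pmatrix} v_{i_1}^{(1)} & \cdots & v_{i_1}^{(M)} \\ \vdots & & \vdots \\ v_{i_M}^{(1)} & \cdots & v_{i_M}^{(M)} \end{pmatrix}.$$

Note that whenever two indices coincide, we have the zero function. If two indices are exchanged, the function changes sign. Let us recall the following result from elementary algebra.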


Proposition 12.32 If $1 \le M \le N$, the space $\Omega_M(\mathbb{R}^N)$ has dimension $\binom{N}{M}$. A basis is given by $(dx_{i_1, \dots, i_M})_{1 \le i_1 < \dots < i_M \le N}$. If $M > N$, then $\Omega_M(\mathbb{R}^N) = \{0\}$.


Proof Let us see that the elements $dx_{i_1, \dots, i_M}$, with $1 \le i_1 < \dots < i_M \le N$, are linearly independent. Assume that

$$\sum_{1 \le i_1 < \dots < i_M \le N} \alpha_{i_1, \dots, i_M}\, dx_{i_1, \dots, i_M} = 0$$

for some real constants $\alpha_{i_1, \dots, i_M}$. We now fix the indices $1 \le j_1 < \dots < j_M \le N$ and prove that $\alpha_{j_1, \dots, j_M} = 0$. Applying the preceding sum to the selected vectors of the canonical basis $\boldsymbol{e}_{j_1}, \dots, \boldsymbol{e}_{j_M}$, we have that

$$\sum_{1 \le i_1 < \dots < i_M \le N} \alpha_{i_1, \dots, i_M} \det \begin{pmatrix} \delta_{i_1 j_1} & \cdots & \delta_{i_1 j_M} \\ \vdots & & \vdots \\ \delta_{i_M j_1} & \cdots & \delta_{i_M j_M} \end{pmatrix} = 0.$$

(Here $\delta_{ij}$ is the Kronecker symbol; it has a value of 1 if $i = j$, otherwise 0.) We see then that, since $1 \le i_1 < \dots < i_M \le N$ and $1 \le j_1 < \dots < j_M \le N$, all the determinants in the preceding sum are equal to 0 except the one with $i_1 = j_1, \dots, i_M = j_M$, which is equal to 1. Hence, $\alpha_{j_1, \dots, j_M} = 0$.

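It remains to be proved that $\Omega_M(\mathbb{R}^N)$ is generated by the elements $dx_{i_1, \dots, i_M}$. Let $\varphi$ be an element of $\Omega_M(\mathbb{R}^N)$. Then,

$$\varphi(\boldsymbol{v}^{(1)}, \dots, \boldsymbol{v}^{(M)}) = \varphi\Big(\sum_{k_1=1}^{N} v_{k_1}^{(1)} \boldsymbol{e}_{k_1}, \dots, \sum_{k_M=1}^{N} v_{k_M}^{(M)} \boldsymbol{e}_{k_M}\Big) = \sum_{k_1, \dots, k_M = 1}^{N} \varphi(\boldsymbol{e}_{k_1}, \dots, \boldsymbol{e}_{k_M})\, v_{k_1}^{(1)} \cdots v_{k_M}^{(M)}.$$

Since $\varphi$ is antisymmetric, in this sum we can assume that the indices are pairwise distinct. Then the sum for $k_1, \dots, k_M$ going from 1 to $N$ can be determined taking the indices $1 \le i_1 < \dots < i_M \le N$ and considering all their permutations $i_{\sigma(1)}, \dots, i_{\sigma(M)}$, with $\sigma : \{1, \dots, M\} \to \{1, \dots, M\}$. We can thus write

$$\sum_{k_1, \dots, k_M = 1}^{N} \varphi(\boldsymbol{e}_{k_1}, \dots, \boldsymbol{e}_{k_M})\, v_{k_1}^{(1)} \cdots v_{k_M}^{(M)} = \sum_{1 \le i_1 < \dots < i_M \le N}\ \sum_{\sigma} \varphi(\boldsymbol{e}_{i_{\sigma(1)}}, \dots, \boldsymbol{e}_{i_{\sigma(M)}})\, v_{i_{\sigma(1)}}^{(1)} \cdots v_{i_{\sigma(M)}}^{(M)}.$$

If we now reorder all terms $\boldsymbol{e}_{i_{\sigma(1)}}, \dots, \boldsymbol{e}_{i_{\sigma(M)}}$, taking into account that exchanging any two vectors the value of $\varphi$ changes sign, we obtain

$$\varphi(\boldsymbol{v}^{(1)}, \dots, \boldsymbol{v}^{(M)}) = \sum_{1 \le i_1 < \dots < i_M \le N}\ \sum_{\sigma} \varepsilon_\sigma\, \varphi(\boldsymbol{e}_{i_1}, \dots, \boldsymbol{e}_{i_M})\, v_{i_{\sigma(1)}}^{(1)} \cdots v_{i_{\sigma(M)}}^{(M)},$$

where $\varepsilon_\sigma$ denotes the sign of the permutation $\sigma : \{1, \dots, M\} \to \{1, \dots, M\}$. By the very definition of the determinant, we thus have that

$$\varphi(\boldsymbol{v}^{(1)}, \dots, \boldsymbol{v}^{(M)}) = \sum_{1 \le i_1 < \dots < i_M \le N} \varphi(\boldsymbol{e}_{i_1}, \dots, \boldsymbol{e}_{i_M}) \det \begin{pmatrix} v_{i_1}^{(1)} & \cdots & v_{i_1}^{(M)} \\ \vdots & & \vdots \\ v_{i_M}^{(1)} & \cdots & v_{i_M}^{(M)} \end{pmatrix},$$

i.e.,

$$\varphi = \sum_{1 \le i_1 < \dots < i_M \le N} \varphi(\boldsymbol{e}_{i_1}, \dots, \boldsymbol{e}_{i_M})\, dx_{i_1, \dots, i_M}.$$

The proof is thus completed. □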
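
The expansion obtained at the end of the proof can be tested on a concrete case. This is only a minimal Python sketch for $M = 2$ and $N = 4$: the bilinear antisymmetric function `phi` is a hypothetical choice made for illustration, and the names `dx2`, `unit` and `expansion` do not come from the text. Indices are 0-based in the code.

```python
from itertools import combinations
from math import comb

N = 4

def phi(v, w):
    """A sample bilinear antisymmetric function on R^4 (hypothetical choice):
    phi(v, w) = sum over i, j of (i - j) * v[i] * w[j]."""
    return sum((i - j) * v[i] * w[j] for i in range(N) for j in range(N))

def dx2(i, j):
    """The function dx_{i,j}, given by a 2x2 determinant."""
    return lambda v, w: v[i] * w[j] - v[j] * w[i]

def unit(i):
    """i-th vector of the canonical basis of R^N."""
    return tuple(1 if k == i else 0 for k in range(N))

pairs = list(combinations(range(N), 2))  # all index pairs with i < j

def expansion(v, w):
    """Sum over i < j of phi(e_i, e_j) * dx_{i,j}(v, w)."""
    return sum(phi(unit(i), unit(j)) * dx2(i, j)(v, w) for i, j in pairs)

v, w = (1, -2, 3, 5), (4, 0, -1, 2)
print(len(pairs), comb(N, 2))        # number of basis elements vs binomial coefficient
print(phi(v, w), expansion(v, w))    # the two values should coincide
```

Since only integers are involved, the two values on the last line coincide exactly, and the number of index pairs equals $\binom{4}{2}$, in agreement with the dimension stated in Proposition 12.32.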


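As an example, let us consider the case $N = 3$ and take a closer look at the spaces $\Omega_1(\mathbb{R}^3)$, $\Omega_2(\mathbb{R}^3)$ and $\Omega_3(\mathbb{R}^3)$.

Consider $\Omega_1(\mathbb{R}^3)$, the space of linear functions defined on $\mathbb{R}^3$, with values in $\mathbb{R}$. We denote by $dx_1, dx_2, dx_3$ the following linear functions:

$$dx_1 : \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \mapsto v_1, \qquad dx_2 : \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \mapsto v_2, \qquad dx_3 : \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \mapsto v_3.$$

The space $\Omega_1(\mathbb{R}^3)$ has dimension 3, and $(dx_1, dx_2, dx_3)$ is one of its bases.

Consider $\Omega_2(\mathbb{R}^3)$, the space of bilinear antisymmetric functions defined on $\mathbb{R}^3 \times \mathbb{R}^3$ with values in $\mathbb{R}$. It has dimension 3, and a basis is given by $(dx_{1,2}, dx_{1,3}, dx_{2,3})$, where

$$dx_{1,2} : \left( \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \begin{pmatrix} v_1' \\ v_2' \\ v_3' \end{pmatrix} \right) \mapsto \det \begin{pmatrix} v_1 & v_1' \\ v_2 & v_2' \end{pmatrix} = v_1 v_2' - v_2 v_1',$$

$$dx_{1,3} : \left( \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \begin{pmatrix} v_1' \\ v_2' \\ v_3' \end{pmatrix} \right) \mapsto \det \begin{pmatrix} v_1 & v_1' \\ v_3 & v_3' \end{pmatrix} = v_1 v_3' - v_3 v_1',$$

$$dx_{2,3} : \left( \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \begin{pmatrix} v_1' \\ v_2' \\ v_3' \end{pmatrix} \right) \mapsto \det \begin{pmatrix} v_2 & v_2' \\ v_3 & v_3' \end{pmatrix} = v_2 v_3' - v_3 v_2'.$$

It is useful to recall that

$$dx_{1,1} = dx_{2,2} = dx_{3,3} = 0,$$

$$dx_{2,1} = -dx_{1,2}, \qquad dx_{3,1} = -dx_{1,3}, \qquad dx_{3,2} = -dx_{2,3}.$$

Consider $\Omega_3(\mathbb{R}^3)$, the space of trilinear antisymmetric functions defined on $\mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3$, with values in $\mathbb{R}$. We denote by $dx_{1,2,3}$ the following trilinear function:

$$dx_{1,2,3} : \left( \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \begin{pmatrix} v_1' \\ v_2' \\ v_3' \end{pmatrix}, \begin{pmatrix} v_1'' \\ v_2'' \\ v_3'' \end{pmatrix} \right) \mapsto \det \begin{pmatrix} v_1 & v_1' & v_1'' \\ v_2 & v_2' & v_2'' \\ v_3 & v_3' & v_3'' \end{pmatrix}.$$

Every element of the vector space $\Omega_3(\mathbb{R}^3)$ is a scalar multiple of $dx_{1,2,3}$, hence the space $\Omega_3(\mathbb{R}^3)$ has dimension 1. Recall that

$$dx_{1,2,3} = dx_{2,3,1} = dx_{3,1,2} = -dx_{3,2,1} = -dx_{2,1,3} = -dx_{1,3,2}$$

and, when two indices coincide, we have the zero function.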
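
The functions $dx_{i_1, \dots, i_M}$ and the sign rules just recalled can be explored with a short program. This is a minimal Python sketch, assuming a determinant computed directly from its permutation expansion; the names `sign`, `det` and `dx` are illustrative, and indices are 1-based as in the text.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..n-1, obtained by counting inversions."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def det(m):
    """Determinant of a square matrix via the sum over all permutations."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        prod = sign(perm)
        for row in range(n):
            prod *= m[row][perm[row]]
        total += prod
    return total

def dx(*indices):
    """Return dx_{i_1,...,i_M} as a function of M vectors."""
    def form(*vectors):
        assert len(vectors) == len(indices)
        # rows follow the chosen indices, columns follow the vectors
        return det([[vec[i - 1] for vec in vectors] for i in indices])
    return form

v, w, u = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(dx(1, 2)(v, w), dx(2, 1)(v, w), dx(1, 1)(v, w))
print(dx(1, 2, 3)(v, w, u), dx(2, 3, 1)(v, w, u), dx(2, 1, 3)(v, w, u))
```

Exchanging two indices flips the sign of the printed value, a cyclic permutation of three indices leaves it unchanged, and a repeated index gives zero.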


Definition 12.33 Given an open subset $O$ of $\mathbb{R}^N$, we call “differential form of degree $M$” (or “$M$-differential form”) a function

$$\omega : O \to \Omega_M(\mathbb{R}^N).$$

If $M \ge 1$, once we consider the basis $(dx_{i_1, \dots, i_M})_{1 \le i_1 < \dots < i_M \le N}$, the components of the $M$-differential form $\omega$ will be denoted by $f_{i_1, \dots, i_M} : O \to \mathbb{R}$. We will then write

$$\omega(\boldsymbol{x}) = \sum_{1 \le i_1 < \dots < i_M \le N} f_{i_1, \dots, i_M}(\boldsymbol{x})\, dx_{i_1, \dots, i_M}.$$

Hence, the $M$-linear antisymmetric function $\omega(\boldsymbol{x})$ is determined by the $\binom{N}{M}$-dimensional vector

$$\big(f_{i_1, \dots, i_M}(\boldsymbol{x})\big)_{1 \le i_1 < \dots < i_M \le N}.$$

A 0-differential form is simply a function defined on $O$ with values in $\mathbb{R}$.

We can then define the exterior product of two differential forms, as presented in Sect. 12.2. If we consider the particular case of the two constant differential forms

$$\omega(\boldsymbol{x}) = dx_1, \qquad \tilde\omega(\boldsymbol{x}) = dx_2, \qquad \text{for every } \boldsymbol{x} \in O,$$

we will have that

$$(\omega \wedge \tilde\omega)(\boldsymbol{x}) = dx_{1,2}, \qquad \text{for every } \boldsymbol{x} \in O.$$

We can then write

$$dx_1 \wedge dx_2 = dx_{1,2}.$$

More generally, in view of the associative property of the exterior product, we can write

$$dx_{i_1} \wedge \dots \wedge dx_{i_M} = dx_{i_1, \dots, i_M}.$$

The informal definition given in Sect. 12.1 is now completely justified.


References Cited in the Book 


1. R.G. Bartle, A Modern Theory of Integration (American Mathematical Society, Providence, 
2001) 

2. Z. Buczolich, The g-integral is not rotation invariant. Real Anal. Exch. 18, 437-447 (1992/1993)

3. R. Henstock, Definitions of Riemann type of the variational integrals. Proc. London Math. Soc. 11, 402-418 (1961)

4. T.W. Körner, Fourier Analysis (Cambridge University Press, Cambridge, 1989)

5. J. Kurzweil, Generalized ordinary differential equations and continuous dependence on a 
parameter. Czechoslov. Math. J. 7, 418-449 (1957) 

6. M. Spivak, Calculus on Manifolds (Benjamin, Amsterdam, 1965) 


Books on the Kurzweil-Henstock Integral 


1. R.G. Bartle, A Modern Theory of Integration (American Mathematical Society, Providence, 
2001) 
2. A. Fonda, The Kurzweil–Henstock Integral for Undergraduates. A Promenade along the Marvelous Theory of Integration (Birkhäuser, Basel, 2018)
3. R.A. Gordon, The Integrals of Lebesgue, Denjoy, Perron, and Henstock (American Mathematical Society, Providence, 1994)
4. R. Henstock, Theory of Integration (Butterworths, London, 1963)
5. R. Henstock, The General Theory of Integration (Clarendon Press, Oxford, 1991)
6. J. Kurzweil, Nichtabsolut Konvergente Integrale (Teubner, Leipzig, 1980)
7. J. Kurzweil, Henstock–Kurzweil Integration: Its Relation to Topological Vector Spaces (World Scientific, Singapore, 2000)
8. S. Leader, The Kurzweil–Henstock Integral and its Differentials (Dekker, New York, 2001)
9. P.Y. Lee, Lanzhou Lectures on Henstock Integration (World Scientific, Singapore, 1989)
10. P.Y. Lee, R. Výborný, The Integral. An Easy Approach after Kurzweil and Henstock (Cambridge University Press, Cambridge, 2000)
11. T.Y. Lee, Henstock–Kurzweil Integration on Euclidean Spaces (World Scientific, Singapore, 2011)
12. J. Mawhin, Analyse: Fondements, Techniques, Evolution (De Boeck, Bruxelles, 1979-1992)
13. R.M. McLeod, The Generalized Riemann Integral (Mathematical Association of America, 
Washington, 1980) 
14. E.J. McShane, Unified Integration (Academic, New York, 1983) 




15. W.F. Pfeffer, The Riemann Approach to Integration (Cambridge University Press, 1993)
16. W.F. Pfeffer, Derivation and Integration (Cambridge University Press, Cambridge, 2001) 
17. Ch. Swartz, Introduction to Gauge Integrals (World Scientific, Singapore, 2001) 


Some Textbooks on Exercises 


1. M. Gémes, Z. Szentmiklóssy, Mathematical Analysis - Exercises I, Eötvös Loránd University (Typotex Publishing House, Budapest, 2014). (Free pdf available online)

2. M. Gémes, Z. Szentmiklóssy, Mathematical Analysis - Problems and Exercises I, Eötvös Loránd University (Typotex Publishing House, Budapest, 2014). (Free pdf available online)

3. T. Radożycki, Solving Problems in Mathematical Analysis, Part I. Sets, Functions, Limits, Derivatives, Integrals, Sequences and Series (Springer, Berlin, 2020)

4. T. Radożycki, Solving Problems in Mathematical Analysis, Part II. Definite, Improper and Multidimensional Integrals, Functions of Several Variables and Differential Equations (Springer, Berlin, 2020)

5. T. Radożycki, Solving Problems in Mathematical Analysis, Part III. Curves and Surfaces, Conditional Extremes, Curvilinear Integrals, Complex Functions, Singularities and Fourier Series (Springer, Berlin, 2020)

6. P. Toni, P.D. Lamberti, G. Drago, 100+1 Problems in Advanced Calculus. A Creative Journey through the Fjords of Mathematical Analysis for Beginners (Springer, Berlin, 2022)




